Past Projects

  • Project Name: Deep Learning for Intensive Longitudinal Biomedical Signals and its Health-Related AI Applications.
  • Supporter: Japan Society for the Promotion of Science (JSPS), Japan (Acceptance Rate: 10.6%).
  • Run Time: 30.09.2019 – 24.07.2021.
  • Role: Author of Proposal, Principal Investigator, Grant Holder.
  • Partners: The University of Tokyo, Imperial College London, Osaka University, Carnegie Mellon University, University of Augsburg, Shizuoka University.
  • Abstract: This research aims to leverage the power of A.I. for analysing and monitoring the daily behaviour of patients suffering from psychiatric diseases via intensive longitudinal biomedical data. We will investigate state-of-the-art machine learning, deep learning, and signal processing techniques with respect to their capacity to screen patients from healthy controls. In addition, we will explore the feasibility of using the A.I. paradigm to implement an automatic monitoring and evaluation system for a subject’s health status based on IoT sensor data. The achievements of this research can facilitate the development of smart wearables for building a human-centred A.I. world with plenty of applications in healthcare and wellbeing.
  • Project Name: HANAMI – Heart sound Analysis and its Non-invasive healthcare Applications via Machine Intelligence.
  • Supporter: Zhejiang Lab, China (Acceptance Rate: < 15%).
  • Run Time: 17.08.2019 – 17.08.2020.
  • Role: Author of Proposal, Principal Investigator, Grant Holder.
  • Partners: The University of Tokyo, Shenzhen University General Hospital, Imperial College London, Carnegie Mellon University, University of Augsburg.
  • Abstract: This project aims to investigate the feasibility of using machine listening to automatically analyse heart sounds recorded by a smart stethoscope. Firstly, we will collect, establish, and release a publicly accessible heart sound database, which will overcome the lack of comparable and sustainable open-source databases in related studies. Secondly, we will make a comprehensive investigation of the fundamental knowledge (e.g., features, machine learning models, and the relationship between pathological heart sounds and their acoustical properties). Thirdly, both traditional machine learning approaches and state-of-the-art deep learning methods will be studied and compared on the released database. We hope these studies can attract more attention from the scientific communities of machine learning, signal processing, cardiology, and biomedical engineering.
  • Project Name: Deep Analysis for General Audio Signal Classification.
  • Supporter: TUM Graduate School, TUM-FGZ-EI, CMU.
  • Run Time: 01.2018 – 03.2018.
  • Role: Author of Proposal, Principal Investigator, Grant Holder.
  • Partners: Technical University of Munich, Carnegie Mellon University, Imperial College London, University of Augsburg.
  • Abstract: This project aims to investigate state-of-the-art deep learning algorithms, e.g., deep belief networks, deep convolutional neural networks, deep recurrent neural networks, and generative adversarial networks, with respect to their performance on the classification of general audio signals. Novel deep-learned acoustic features will be extracted from the sounds’ spectrograms and scalograms by transfer learning architectures or expert-designed networks. Furthermore, these features are expected to improve the final predictions when combined with conventional temporal and spectral features. The algorithms and models studied in this project can be used for healthcare (e.g., snore/heart sounds), ecology (e.g., bird sounds), and daily life surveillance (e.g., acoustic scenes).
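The fusion idea in this abstract — combining deep-learned spectrogram features with conventional temporal and spectral descriptors — can be sketched as follows. This is a minimal illustration, not the project's actual code: the random-projection "deep" feature extractor stands in for a pretrained transfer-learning network, and the three hand-crafted descriptors are hypothetical examples of conventional features.

```python
import numpy as np
from scipy.signal import spectrogram

def conventional_features(x, fs):
    # hand-crafted temporal/spectral descriptors: RMS energy,
    # zero-crossing rate, spectral centroid
    rms = np.sqrt(np.mean(x ** 2))
    zcr = np.mean(np.abs(np.diff(np.sign(x)))) / 2
    f, _, S = spectrogram(x, fs=fs, nperseg=256)
    centroid = np.sum(f[:, None] * S) / (np.sum(S) + 1e-12)
    return np.array([rms, zcr, centroid])

def deep_features(x, fs, rng):
    # stand-in for a pretrained CNN: a fixed random projection
    # of the log-spectrogram down to 64 dimensions
    _, _, S = spectrogram(x, fs=fs, nperseg=256)
    log_s = np.log(S + 1e-12).flatten()
    W = rng.standard_normal((64, log_s.size)) / np.sqrt(log_s.size)
    return W @ log_s

rng = np.random.default_rng(0)
fs = 16000
x = rng.standard_normal(fs)  # one second of dummy audio

# late fusion: concatenate both feature views before classification
fused = np.concatenate([conventional_features(x, fs), deep_features(x, fs, rng)])
print(fused.shape)  # (67,)
```

In practice the random projection would be replaced by activations from a network such as a CNN pretrained on large audio or image corpora, but the fusion step — concatenating both feature views and feeding them to a single classifier — stays the same.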
  • Project Name: Development of Just-in-Time Adaptive Intervention for Behavioural Modification based on Continuous Psycho-behavioural Monitoring under Daily Life.
  • Supporter: Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan (Acceptance Rate: 24.8%).
  • Run Time: 01.04.2017 – 31.03.2020.
  • Role: Main Participant.
  • Partners: The University of Tokyo, Fujita Health University, Nagoya City University.
  • Abstract: The purpose of this study is to develop a risk-detection-based intervention guidance method that delivers just-in-time behavioural change interventions, and to verify its clinical applicability.
  • Project Name: Fast Recognition of Bird Sounds Data by High Performance Computing System.
  • Supporter: TUM Graduate School, Tokyo Tech.
  • Run Time: 04.2016 – 11.2016.
  • Role: Author of Proposal, Principal Investigator, Grant Holder.
  • Partners: Technical University of Munich, Tokyo Institute of Technology, Imperial College London, University of Passau.
  • Abstract: The dramatically growing big audio data from the Internet, videos, and music, among other sources, bring huge opportunities, but also great challenges, to the relevant research and industry communities. In this study, we focus on how to implement state-of-the-art theories and techniques for large-scale audio data classification on a high performance computing (HPC) system. As a case study, we use the big bird sound data provided by ‘xeno-canto’ (more than 272,360 audio recordings covering 9,400 bird species worldwide, approximately 1 TB in size) together with our toolkits CURRENNT (building LSTMs (Long Short-Term Memory networks) for the segmentation of bird syllables), openSMILE (extracting large-scale acoustic features), and MXNET (a deep neural network (DNN) platform), to implement an automatic big bird sound detection, learning, and classification paradigm on the advanced and energy-friendly TSUBAME 2.5, an HPC system designed and established by Tokyo Tech. This project aims to design and implement a feasible deep big audio data learning framework on a supercomputing system. The outputs of this study can also be applied to other academic tasks and industrial applications related to big audio data learning and processing.
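The pipeline described here — segment syllables, extract acoustic features, classify, with recordings distributed across compute nodes — can be outlined in miniature. This is a toy sketch under stated assumptions: the energy-based segmenter stands in for the CURRENNT LSTM, the three summary statistics for openSMILE features, the nearest-centroid rule for the MXNET DNN, and a thread pool for the HPC node-level parallelism.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def segment(x, frame=256):
    # energy-based syllable mask; stands in for the CURRENNT LSTM segmenter
    n = len(x) // frame
    e = np.array([np.mean(x[i*frame:(i+1)*frame] ** 2) for i in range(n)])
    return e > 0.5 * e.mean()

def features(x):
    # toy summary statistics; stands in for openSMILE acoustic features
    return np.array([x.mean(), x.std(), np.abs(x).max()])

def classify(feat, centroids):
    # nearest-centroid rule; stands in for the MXNET DNN classifier
    return int(np.argmin(np.linalg.norm(centroids - feat, axis=1)))

rng = np.random.default_rng(1)
recordings = [rng.standard_normal(4096) for _ in range(8)]  # dummy clips
centroids = rng.standard_normal((3, 3))  # 3 hypothetical species centroids

def process(x, frame=256):
    mask = segment(x, frame)                            # which frames are syllables
    voiced = x[:len(mask) * frame][np.repeat(mask, frame)]  # keep only those frames
    return classify(features(voiced), centroids)

# distribute the recording list across workers, as the HPC nodes would
with ThreadPoolExecutor(max_workers=4) as pool:
    labels = list(pool.map(process, recordings))
```

At HPC scale the embarrassingly parallel structure is the same: each node processes an independent shard of the recording list, and only the per-recording labels are gathered at the end.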
  • Project Name: Automatic Detection and Classification of Different Snore Related Sounds from Overall Night Audio Recordings.
  • Supporter: NJUST Graduate School, NTU Information Systems Research Laboratory.
  • Run Time: 11.2013 – 03.2014.
  • Role: Author of Proposal, Principal Investigator, Grant Holder.
  • Partners: Nanjing University of Science and Technology, Nanyang Technological University, Beijing Hospital.
  • Abstract: This project aims to establish a complete framework for automatically detecting and classifying different snore-related sounds from overnight microphone audio recordings. The framework includes the steps of signal detection, feature extraction, feature selection, and machine learning.
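The four steps named in the abstract can be sketched end to end. This is a minimal illustration with hypothetical choices at every step — an energy threshold for detection, toy temporal descriptors, variance-based selection, and a nearest-centroid rule in place of the trained classifier — not the project's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
frame = 512
x = rng.standard_normal(16 * frame)            # dummy overnight recording
for i, gain in [(2, 6.0), (7, 4.0), (12, 8.0)]:
    x[i*frame:(i+1)*frame] *= gain             # inject three loud "snore" events

def detect_events(x, frame=512, k=2.0):
    # step 1: signal detection - keep frames whose energy exceeds k x the median
    n = len(x) // frame
    e = np.array([np.mean(x[i*frame:(i+1)*frame] ** 2) for i in range(n)])
    return [x[i*frame:(i+1)*frame] for i in np.nonzero(e > k * np.median(e))[0]]

def extract_features(seg):
    # step 2: feature extraction - toy temporal descriptors per event
    zcr = np.mean(np.abs(np.diff(np.sign(seg)))) / 2
    return np.array([seg.std(), np.abs(seg).max(), zcr])

def select_features(F, top=2):
    # step 3: feature selection - keep the highest-variance feature columns
    idx = np.argsort(F.var(axis=0))[::-1][:top]
    return F[:, idx]

events = detect_events(x)
F = np.vstack([extract_features(s) for s in events])
F_sel = select_features(F)

# step 4: machine learning - nearest-centroid stand-in for the trained classifier
centroids = np.array([[4.0, 4.0], [8.0, 8.0]])  # hypothetical class centroids
labels = [int(np.argmin(np.linalg.norm(centroids - f, axis=1))) for f in F_sel]
```

Each stage feeds the next: detected event segments become feature vectors, the selected feature subset becomes the classifier input, so any stage can be swapped out (e.g., a learned detector, spectral features, or a proper trained model) without changing the overall framework.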
  • Project Name: iHEARu – Intelligent Systems’ Holistic Evolving Analysis of Real-life Universal Speaker Characteristics.
  • Supporter: European Research Council (ERC).
  • Run Time: 01.01.2014 – 31.12.2018.
  • Role: Main Participant.
  • Partners: University of Augsburg, University of Passau, Technical University of Munich.
  • Abstract: Recently, automatic speech and speaker recognition has matured to the degree that it has entered the daily lives of thousands of Europe’s citizens, e.g., on their smart phones or in call services. During the next years, speech processing technology will move to a new level of social awareness to make interaction more intuitive, speech retrieval more efficient, and lend additional competence to computer-mediated communication and speech-analysis services in the commercial, health, security, and further sectors. To reach this goal, rich speaker traits and states such as age, height, personality, and physical and mental state, as carried by the tone of the voice and the spoken words, must be reliably identified by machines. In the iHEARu project, ground-breaking methodology including novel techniques for multi-task and semi-supervised learning will deliver, for the first time, intelligent holistic and evolving analysis, in real-life conditions, of universal speaker characteristics which have so far been considered only in isolation. Today’s sparseness of annotated realistic speech data will be overcome by large-scale speech and meta-data mining from public sources such as social media, crowd-sourcing for labelling and quality control, and shared semi-automatic annotation. All stages, from pre-processing and feature extraction to statistical modelling, will evolve in “life-long learning” according to new data, by utilising feedback, deep, and evolutionary learning methods. Human-in-the-loop system validation and novel perception studies will analyse the self-organising systems and the relation of automatic signal processing to human interpretation in a previously unseen variety of speaker classification tasks.
The project’s work plan gives the unique opportunity to transfer current world-leading expertise in this field into a new de-facto standard of speaker characterisation methods and open-source tools ready for tomorrow’s challenge of socially aware speech analysis.