Support for nurses as AI detects premature babies crying in a single-room NICU


One in ten babies is born premature. Premature babies have an increased risk of developing diseases. Critically ill neonates are often admitted to a Neonatal Intensive Care Unit (NICU) and cared for in an incubator under the supervision of physicians and nurses. Nurses need to watch over the child and be ready to provide care at any time. With the growing number of admitted babies and pressure on staff availability, this can be hard to manage.

Just like full-term babies, premature babies cry to signal that something is wrong. And human as we are, we respond to that, as do the nurses in intensive care. But sometimes human nature is unexpectedly hindered by new developments with the best intentions: meet ‘single-room care’, where each baby is cared for in a dedicated private room, with better environmental control, less stress for the child, and more comfort for parents. Many hospitals are switching to single rooms, but this creates peculiar problems for monitoring and watching over the child. For example: how do you hear when a baby is crying?

AI cry detection

Detecting a cry in a NICU room is not as easy as it seems. While humans are highly attuned to the sound of a crying baby (by design, through evolution), to an algorithm it is just another recorded sound like any other. Additionally, the room holding the incubator has constant background noise of up to 50 dB, with medical equipment producing alarm beeps and other sound signals, and people in the room who might be talking. On top of that, the microphone is preferably located outside the incubator, so the baby's cry is often much quieter than the background sounds.

Therefore, a simple noise-level detector would not cut it for this application, nor would frequency-band filtering techniques. Instead, Neolook designed a machine learning algorithm to analyse the audio stream live and raise an alert whenever crying is detected.
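To see why a plain loudness threshold fails, consider a toy sketch (not the actual NICU signal chain, just illustrative numbers): a loud equipment beep easily exceeds the level of a quiet, distant cry, so any threshold that catches the cry would also fire on the beep.

```python
import math

def level_db(samples):
    """RMS level of a signal chunk, in dB relative to full scale."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)

# Toy chunks at 8 kHz: a loud 1 kHz alarm beep versus a quiet 400 Hz cry.
alarm = [0.5 * math.sin(2 * math.pi * 1000 * n / 8000) for n in range(8000)]
cry = [0.05 * math.sin(2 * math.pi * 400 * n / 8000) for n in range(8000)]

# The alarm is ~20 dB louder: a level threshold would flag the beep,
# while the cry stays below it.
print(level_db(alarm) > level_db(cry))
```

Whatever threshold you choose, you either drown in false alerts from equipment noise or miss the quiet cries entirely; the detector has to look at *what* the sound is, not just how loud it is.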

For this approach, live audio is first converted into a spectrogram: a type of image that shows how the frequencies in an audio signal evolve over time. An example can be seen below:
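The conversion itself is standard signal processing. A minimal sketch in NumPy (the frame length, hop size, and sample rate here are illustrative choices, not Neolook's actual parameters): the audio is cut into short overlapping windows, and each window is transformed to the frequency domain, giving one column of the spectrogram image.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_len=1024, hop=512):
    """Magnitude spectrogram: rows are frequency bins, columns are time frames."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Real FFT of each windowed frame gives the energy per frequency bin.
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    return magnitudes.T, freqs

# Sanity check with a 440 Hz test tone: its energy should concentrate
# in the frequency bin closest to 440 Hz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
spec, freqs = spectrogram(tone, sr)
peak_bin = spec.mean(axis=1).argmax()
```

A cry has a characteristic pattern in this image (a fundamental pitch with harmonics that rise and fall), which is exactly the kind of structure an image classifier can learn to recognise.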


It is then the task of the machine learning algorithm to take this low-resolution image and determine whether it represents a baby crying. To do this at scale for a fully operational NICU, an advanced convolutional neural network (CNN) was set up and trained on 10,753 examples of no crying and 3,092 examples of a baby crying. All samples were recorded in the live situation, to mimic the final environment as closely as possible. Obtaining examples of a baby crying is harder than it might seem: the baby spends most of its time sleeping in the incubator, so the recorded data mostly consists of other sounds.
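The overall shape of such a classifier can be sketched in a few lines. The following is a bare-bones forward pass with random, untrained weights, written in plain NumPy purely to show the pipeline (convolution, nonlinearity, pooling, and a probability output); it is not Neolook's actual architecture, which the article does not detail.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(image, kernels):
    """Valid 2D convolution of an (H, W) image with (K, kh, kw) kernels."""
    k, kh, kw = kernels.shape
    h, w = image.shape
    out = np.empty((k, h - kh + 1, w - kw + 1))
    for ki in range(k):
        for i in range(h - kh + 1):
            for j in range(w - kw + 1):
                out[ki, i, j] = np.sum(image[i:i + kh, j:j + kw] * kernels[ki])
    return out

def forward(spec_patch, kernels, weights, bias):
    """One forward pass: conv -> ReLU -> global average pool -> sigmoid."""
    feat = np.maximum(conv2d(spec_patch, kernels), 0)  # ReLU activation
    pooled = feat.mean(axis=(1, 2))                    # global average pool
    logit = pooled @ weights + bias
    return 1.0 / (1.0 + np.exp(-logit))                # P(crying)

# Random weights and a random stand-in for a 32x32 spectrogram patch.
kernels = rng.normal(size=(4, 3, 3)) * 0.1
weights = rng.normal(size=4)
patch = rng.normal(size=(32, 32))
p_crying = forward(patch, kernels, weights, 0.0)
```

Training then consists of adjusting the kernel and dense weights so that `p_crying` is pushed towards 1 on crying examples and 0 on the rest, which is where the 10,753 and 3,092 recorded samples come in.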

After training, the algorithm achieved an accuracy of 94.72%. This is lower than the optimal value, but special care was taken in preprocessing and post-processing to make the algorithm more selective; that is to say, the algorithm needs to be sure it is a baby crying before it actually sends an alert. This design choice prevents ‘alarm fatigue’: a problem where nurses start to ignore alarms because too many of them occur. That happens when the algorithm sends too many false alerts, reporting a crying baby when no crying is happening. It is therefore better to balance the performance trade-off in favour of reliability.
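One common way to favour reliability is to raise the decision threshold on the model's output probability. A small sketch with made-up scores (not the real NICU data) shows the trade-off: a higher threshold increases precision (fewer false alerts) at the cost of recall (some cries are flagged later or missed).

```python
# Synthetic example: 1 = crying, 0 = not crying, with model scores.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.90, 0.70, 0.40, 0.85, 0.30, 0.20, 0.15, 0.10, 0.05]

def precision_recall(threshold):
    """Precision and recall when alerting only on scores >= threshold."""
    preds = [s >= threshold for s in scores]
    tp = sum(p and l for p, l in zip(preds, labels))
    fp = sum(p and not l for p, l in zip(preds, labels))
    fn = sum((not p) and l for p, l in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn)
    return precision, recall

p_low, r_low = precision_recall(0.5)    # default threshold
p_high, r_high = precision_recall(0.88)  # stricter threshold
```

With the stricter threshold, every alert that does fire is trustworthy, which is exactly what keeps nurses responding to them.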