chain: The output (buffer) of an FFT UGen that transforms the audio input to be tracked. For the FFT chain, you should typically use a frame size of 512 or 1024 (at 44.1 kHz sampling rate) and 50% hop size (which is the default setting in SC). For other sampling rates, choose an FFT size that covers a similar time span (around 10 to 20 ms).
threshold: The detection threshold, typically between 0 and 1, although in rare cases you may find values outside this range useful.
odftype: Index of the function used to analyse the signal. See the description below for possible values (can usually be left at the default).
relaxtime: (advanced setting) Specifies the time (in seconds) for the normalisation to "forget" about a recent onset. If you find too much re-triggering (e.g. as a note dies away unevenly), you might wish to increase this value. Not used with "mkl".
floor: (advanced setting) A lower limit, connected to the idea of how quiet the sound is expected to get without becoming indistinguishable from noise. For some cleanly recorded classical music with wide dynamic variations, it was found helpful to go down as far as 1e-6. Not used with "mkl".
mingap: (advanced setting) Specifies a minimum gap (in FFT frames) between onset detections, a brute-force way to prevent too many doubled detections.
medianspan: (advanced setting) Specifies the size (in FFT frames) of the median window used for smoothing the detection function before triggering.
whtype: (advanced setting) ?
rawodf: (advanced setting) ? (init-time only)
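As a rough sketch of how these arguments fit together, the following picks an FFT size for the server's sampling rate and passes the main settings by keyword. The keyword names and the values shown (intended to mirror the class defaults) are assumptions based on the class definition as I understand it, so check them against your installed version. The 11.6 ms target is simply 512 samples at 44.1 kHz. The example listens to the first hardware input, so headphones are advisable to avoid feedback.

```supercollider
(
s.waitForBoot {
    // Assumption: aim for the power-of-two frame size closest to about 11.6 ms.
    var fftSize = (2 ** (s.sampleRate * 0.0116).log2.round).asInteger;
    // With a 50% hop, a mingap of 10 frames lasts this many seconds.
    var gapSeconds = 10 * (fftSize / 2) / s.sampleRate;
    "FFT size: %, mingap of 10 frames = % s".format(fftSize, gapSeconds.round(0.001)).postln;

    {
        var sig = SoundIn.ar(0);
        var chain = FFT(LocalBuf(fftSize), sig);
        // Keyword names and values are assumed to match the class defaults.
        var onsets = Onsets.kr(chain, threshold: 0.5, odftype: 3,
            relaxtime: 1, floor: 0.1, mingap: 10, medianspan: 11);
        // Mark each detected onset with a short noise burst.
        var pips = WhiteNoise.ar(EnvGen.kr(Env.perc(0.001, 0.1, 0.2), onsets));
        Pan2.ar(sig, -0.75, 0.2) + Pan2.ar(pips, 0.75, 1);
    }.play;
};
)
```

If detections double up on decaying notes, raising relaxtime or mingap in this sketch is the first thing to try, in line with the descriptions above.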
An onset detecting UGen for musical audio signals. It detects the beginning of notes/drumbeats/etc. Outputs a control-rate trigger signal which is 1 when an onset is detected, and 0 otherwise.
The onset detection should work well for a general range of monophonic and polyphonic audio signals. The onset detection is purely based on signal analysis and does not make use of any "top-down" inferences such as tempo.
There are different functions available for the analysis:
- 0 "power" -- generally OK, good for percussive input, and also very efficient
- 1 "magsum" -- generally OK, good for percussive input, and also very efficient
- 2 "complex" -- performs generally very well, but more CPU-intensive
- 3 "rcomplex" (default) -- performs generally very well, and slightly more efficient than "complex"
- 4 "phase" -- generally good, especially for tonal input, medium efficiency
- 5 "wphase" -- generally very good, especially for tonal input, medium efficiency
- 6 "mkl" -- generally very good, medium efficiency, pretty different from the other methods
The differences aren't large, and in tests all of the functions perform at a similar quality level, each with slightly different characteristics. It is therefore recommended that you stick with the default "rcomplex" unless you find specific problems with it. If you do, try "wphase" next. The "mkl" type works rather differently from the others, so it is also worth trying.
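To compare detection functions by ear and eye, one option is to feed the UGen a predictable test signal and post a message for each trigger. The sketch below is only an illustration: the test tone, the '/onset' reply path, and the \onsetPost name are made up for this example, and it assumes the server is already booted. Change the third argument to any index from the list above to try another function.

```supercollider
(
// Post the time (in seconds since the synth started) of each detected onset.
OSCdef(\onsetPost, { |msg|
    "onset at % s".format(msg[3].round(0.01)).postln;
}, '/onset');

{
    // Test signal: two short sine blips per second.
    var src = Decay2.ar(Impulse.ar(2), 0.001, 0.2) * SinOsc.ar(440) * 0.3;
    var chain = FFT(LocalBuf(512), src);
    // 5 selects "wphase"; use 3 for the default "rcomplex", 6 for "mkl", etc.
    var trig = Onsets.kr(chain, 0.5, 5);
    // Send the elapsed time to the language whenever an onset is detected.
    SendReply.kr(trig, '/onset', Sweep.kr(0, 1));
    src ! 2;
}.play;
)
```

When you are done, stop the synth as usual and call OSCdef(\onsetPost).free to remove the responder.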
PV_HainsworthFoote
PV_JensenAndersen