Commit
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
/lib/libsnowboy-detect.a | ||
snowboy-detect-swig.cc | ||
snowboydetect.py | ||
|
||
*.pyc | ||
*.o | ||
*.so |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,117 @@ | ||
# Snowboy Hotword Detection | ||
|
||
by [KITT.AI](http://kitt.ai). | ||
|
||
[Home Page](https://snowboy.kitt.ai) | ||
|
||
[Full Documentation](https://snowboy.kitt.ai/docs) | ||
|
||
|
||
Version: 1.0.0 (5/10/2016) | ||
|
||
Snowboy is a customizable hotword detection engine for you to create your own | ||
hotword like "OK Google" or "Alexa". It is powered by deep neural networks and has the following properties: | ||
|
||
* **highly customizable**: you can freely define your own magic phrase here – | ||
let it be “open sesame”, “garage door open”, or “hello dreamhouse”, you name it. | ||
|
||
* **always listening** but protects your privacy: Snowboy does not use the Internet and does *not* stream your voice to the cloud. | ||
|
||
* light-weight and **embedded**: it even runs on a Raspberry Pi and consumes less than 10% CPU on the weakest Pi (single-core 700MHz ARMv6). | ||
|
||
* Apache licensed! | ||
|
||
Currently Snowboy supports: | ||
|
||
* all versions of Raspberry Pi (with Raspbian based on Debian Jessie 8.0) | ||
* 64bit Mac OS X | ||
* 64bit Ubuntu (12.04 and 14.04) | ||
|
||
It ships in the form of a **C library** with **Python** wrappers generated by SWIG. We welcome wrappers for other languages -- feel free to send a pull request! | ||
|
||
If you want support on other hardware/OS, please send your request to [[email protected]](mailto:snowboy.kitt.ai) | ||
|
||
|
||
## Dependencies | ||
|
||
Snowboy's Python wrapper uses PortAudio to access your device's microphone. | ||
|
||
### Mac OS X | ||
|
||
`brew` install `swig`, `sox`, `portaudio` and its Python binding `pyaudio`: | ||
|
||
brew install swig portaudio sox | ||
pip install pyaudio | ||
|
||
If you don't have Homebrew installed, please download it [here](http://brew.sh/). If you don't have `pip`, you can install it [here](https://pip.pypa.io/en/stable/installing/). | ||
|
||
Make sure that you can record audio with your microphone: | ||
|
||
rec t.wav | ||
|
||
### Ubuntu | ||
|
||
First `apt-get` install `swig`, `sox`, `portaudio` and its Python binding `pyaudio`: | ||
|
||
sudo apt-get install swig3.0 python-pyaudio python3-pyaudio sox | ||
pip install pyaudio | ||
|
||
Then install the `atlas` matrix computing library: | ||
|
||
sudo apt-get install libatlas-base-dev | ||
|
||
Make sure that you can record audio with your microphone: | ||
|
||
rec t.wav | ||
If you need extra setup on your audio (especially on a Raspberry Pi), please see the [full documentation](https://snowboy.kitt.ai/docs). | ||
|
||
## Compile a Python Wrapper | ||
|
||
cd swig/python | ||
make | ||
|
||
SWIG will generate a `_snowboydetect.so` file and a simple (but hard-to-read) Python wrapper `snowboydetect.py`. We have provided a higher-level Python wrapper `snowboydecoder.py` on top of that. | ||
|
||
Feel free to adapt the `Makefile` in `swig/python` to your own system's setting if you cannot `make` it. | ||
|
||
|
||
## Quick Start | ||
|
||
Go to the `swig/python` folder and open your python console: | ||
|
||
In [1]: import snowboydecoder | ||
|
||
In [2]: def detected_callback(): | ||
....:     print("hotword detected") | ||
....: | ||
|
||
In [3]: detector = snowboydecoder.HotwordDetector("resources/snowboy.umdl", sensitivity=0.5, audio_gain=1) | ||
|
||
In [4]: detector.start(detected_callback) | ||
|
||
Then say "snowboy" into your microphone to see whether Snowboy detects you. | ||
|
||
The `snowboy.umdl` file is a "universal" model that detects different people saying "snowboy". If you want other hotwords, please go to [snowboy.kitt.ai](https://snowboy.kitt.ai) to record, train and download your own personal model (a `.pmdl` file). | ||
|
||
When `sensitivity` is higher, the hotword is triggered more easily, but you might get more false alarms. | ||
|
||
`audio_gain` controls whether to increase (>1) or decrease (<1) input volume. | ||
|
||
Two demo files, `demo.py` and `demo2.py`, are provided to show further usage examples. | ||
|
||
Note: if you see the following error: | ||
|
||
TypeError: __init__() got an unexpected keyword argument 'model_str' | ||
|
||
You are probably using an old version of SWIG. Please upgrade. We have tested with SWIG version 3.0.7 and 3.0.8. | ||
|
||
## Advanced Usages & Demos | ||
|
||
See [Full Documentation](https://snowboy.kitt.ai/docs). | ||
|
||
## Change Log | ||
|
||
**5/10/2016** | ||
|
||
* initial release |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,112 @@ | ||
// include/snowboy-detect.h | ||
|
||
// Copyright 2016 KITT.AI (author: Guoguo Chen) | ||
|
||
#ifndef SNOWBOY_INCLUDE_SNOWBOY_DETECT_H_ | ||
#define SNOWBOY_INCLUDE_SNOWBOY_DETECT_H_ | ||
|
||
#include <memory> | ||
#include <string> | ||
|
||
namespace snowboy { | ||
|
||
// Forward declaration. | ||
struct WaveHeader; | ||
class PipelineDetect; | ||
|
||
//////////////////////////////////////////////////////////////////////////////// | ||
// | ||
// SnowboyDetect class interface. | ||
// | ||
//////////////////////////////////////////////////////////////////////////////// | ||
class SnowboyDetect { | ||
public: | ||
// Constructor that takes a resource file, and a list of hotword models which | ||
// are separated by commas. In case more than one hotword exists in the | ||
// provided models, RunDetection() will return the index of the hotword, if | ||
// the corresponding hotword is triggered. | ||
// | ||
// CAVEAT: a personal model contains only one hotword, but a universal model | ||
// may contain multiple hotwords. It is your responsibility to figure | ||
// out the index of the hotword. For example, if your model string is | ||
// "foo.pmdl,bar.umdl", where foo.pmdl contains hotword x, bar.umdl | ||
// has two hotwords y and z, the indices of different hotwords are as | ||
// follows: | ||
// x 1 | ||
// y 2 | ||
// z 3 | ||
// | ||
// @param [in] resource_filename Filename of resource file. | ||
// @param [in] model_str A string of multiple hotword models, | ||
// separated by comma. | ||
SnowboyDetect(const std::string& resource_filename, | ||
const std::string& model_str); | ||
|
||
// Resets the detection. This class handles voice activity detection (VAD) | ||
// internally. But if you have an external VAD, you should call Reset() | ||
// whenever your VAD reports the end of a segment. | ||
bool Reset(); | ||
|
||
// Runs hotword detection. Supported audio format is WAVE (with linear PCM, | ||
// 8-bit unsigned integer, 16-bit signed integer or 32-bit signed integer). | ||
// See SampleRate(), NumChannels() and BitsPerSample() for the required | ||
// sampling rate, number of channels and bits per sample values. You are | ||
// supposed to provide a small chunk of data (e.g., 0.1 second) each time you | ||
// call RunDetection(). Larger chunks usually lead to longer delay, but lower | ||
// CPU usage. | ||
// | ||
// Definition of return values: | ||
// -1: Error. | ||
// 0: No event. | ||
// 1: Hotword 1 triggered. | ||
// 2: Hotword 2 triggered. | ||
// ... | ||
// | ||
// @param [in] data Small chunk of data to be detected. See | ||
// above for the supported data format. | ||
int RunDetection(const std::string& data); | ||
|
||
// Sets the sensitivity string for the loaded hotwords. A <sensitivity_str> is | ||
// a comma-separated list of floating-point numbers between 0 and 1. For | ||
// example, if there are 3 loaded hotwords, your string should look something | ||
// like this: | ||
// 0.4,0.5,0.8 | ||
// Make sure you properly align the sensitivity value to the corresponding | ||
// hotword. | ||
void SetSensitivity(const std::string& sensitivity_str); | ||
|
||
|
||
// Returns the sensitivity string for the current hotwords. | ||
std::string GetSensitivity() const; | ||
|
||
// Applies a fixed gain to the input audio. In case you have a very weak | ||
// microphone, you can use this function to boost input audio level. | ||
void SetAudioGain(const float audio_gain); | ||
|
||
|
||
// Writes the models to the model filenames specified in <model_str> in the | ||
// constructor. This overwrites the original model with the latest parameter | ||
// setting. You are supposed to call this function if you have updated the | ||
// hotword sensitivities through SetSensitivity(), and you would like to store | ||
// those values in the model as the default value. | ||
void UpdateModel() const; | ||
|
||
// Returns the number of loaded hotwords. This helps you to figure out the | ||
// indices of the hotwords. | ||
int NumHotwords() const; | ||
|
||
// Returns the required sampling rate, number of channels and bits per sample | ||
// values for the audio data. You should use this information to set up your | ||
// audio capturing interface. | ||
int SampleRate() const; | ||
int NumChannels() const; | ||
int BitsPerSample() const; | ||
|
||
~SnowboyDetect(); | ||
|
||
private: | ||
std::unique_ptr<WaveHeader> wave_header_; | ||
std::unique_ptr<PipelineDetect> detect_pipeline_; | ||
}; | ||
|
||
} // namespace snowboy | ||
|
||
#endif // SNOWBOY_INCLUDE_SNOWBOY_DETECT_H_ |
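
The `RunDetection()` comment in this header asks for small chunks (e.g., 0.1 second) in the format reported by `SampleRate()`, `NumChannels()` and `BitsPerSample()`. As a rough sketch of the arithmetic only, the hypothetical helper `chunk_size_bytes` below (not part of Snowboy) converts a duration into a byte count for linear PCM. The 16000 Hz, mono, 16-bit values are assumptions for illustration; in practice they would come from the detector itself.

```python
def chunk_size_bytes(sample_rate, num_channels, bits_per_sample, seconds=0.1):
    """Return the number of bytes of linear PCM that cover `seconds` of audio.

    Hypothetical helper for illustration; it is not part of the Snowboy API.
    """
    bytes_per_frame = num_channels * (bits_per_sample // 8)
    num_frames = int(round(sample_rate * seconds))
    return num_frames * bytes_per_frame


# Assumed example format: 16 kHz, mono, 16-bit samples.
size = chunk_size_bytes(sample_rate=16000, num_channels=1, bits_per_sample=16)
print(size)  # 3200
```

Doubling `seconds` doubles the chunk size, which matches the trade-off described in the comment: fewer calls per second, but a longer wait before a detection can be reported.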
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
# Example Makefile that converts snowboy c++ library (libsnowboy-detect.a) to | ||
# python library (_snowboydetect.so, snowboydetect.py), using swig. | ||
|
||
# Some versions of swig do not work well. We prefer compiling swig from source | ||
# code. We have tested swig-3.0.7.tar.gz. | ||
SWIG := swig | ||
|
||
SNOWBOYDETECTSWIGITF = snowboy-detect-swig.i | ||
SNOWBOYDETECTSWIGOBJ = snowboy-detect-swig.o | ||
SNOWBOYDETECTSWIGCC = snowboy-detect-swig.cc | ||
SNOWBOYDETECTSWIGLIBFILE = _snowboydetect.so | ||
|
||
TOPDIR := ../../ | ||
CXXFLAGS := -I$(TOPDIR) -O3 -fPIC | ||
LDFLAGS := | ||
|
||
ifeq ($(shell uname), Darwin) | ||
CXX := clang++ | ||
PYINC := $(shell /usr/bin/python2.7-config --includes) | ||
PYLIBS := $(shell /usr/bin/python2.7-config --ldflags) | ||
SWIGFLAGS := -bundle -flat_namespace -undefined suppress | ||
LDLIBS := -lm -ldl -framework Accelerate | ||
SNOWBOYDETECTLIBFILE = $(TOPDIR)/lib/osx/libsnowboy-detect.a | ||
else | ||
CXX := g++ | ||
PYINC := $(shell python-config --cflags) | ||
PYLIBS := $(shell python-config --ldflags) | ||
SWIGFLAGS := -shared | ||
CXXFLAGS += -std=c++0x | ||
# Make sure you have Atlas installed. You can statically link Atlas if you | ||
# would like to be able to move the library to a machine without Atlas. | ||
LDLIBS := -lm -ldl -lf77blas -lcblas -llapack_atlas -latlas | ||
SNOWBOYDETECTLIBFILE = $(TOPDIR)/lib/ubuntu64/libsnowboy-detect.a | ||
endif | ||
|
||
all: $(SNOWBOYDETECTSWIGLIBFILE) | ||
|
||
%.a: | ||
$(MAKE) -C ${@D} ${@F} | ||
|
||
$(SNOWBOYDETECTSWIGCC): $(SNOWBOYDETECTSWIGITF) | ||
$(SWIG) -I$(TOPDIR) -c++ -python -o $(SNOWBOYDETECTSWIGCC) $(SNOWBOYDETECTSWIGITF) | ||
|
||
$(SNOWBOYDETECTSWIGOBJ): $(SNOWBOYDETECTSWIGCC) | ||
$(CXX) $(PYINC) $(CXXFLAGS) -c $(SNOWBOYDETECTSWIGCC) | ||
|
||
$(SNOWBOYDETECTSWIGLIBFILE): $(SNOWBOYDETECTSWIGOBJ) $(SNOWBOYDETECTLIBFILE) | ||
$(CXX) $(CXXFLAGS) $(LDFLAGS) $(SWIGFLAGS) $(SNOWBOYDETECTSWIGOBJ) \ | ||
$(SNOWBOYDETECTLIBFILE) $(PYLIBS) $(LDLIBS) -o $(SNOWBOYDETECTSWIGLIBFILE) | ||
|
||
clean: | ||
-rm -f *.o *.a *.so snowboydetect.py *.pyc $(SNOWBOYDETECTSWIGCC) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
import snowboydecoder | ||
import sys | ||
import signal | ||
|
||
interrupted = False | ||
|
||
|
||
def signal_handler(signal, frame): | ||
global interrupted | ||
interrupted = True | ||
|
||
|
||
def interrupt_callback(): | ||
global interrupted | ||
return interrupted | ||
|
||
if len(sys.argv) == 1: | ||
print("Error: need to specify model name") | ||
print("Usage: python demo.py your.model") | ||
sys.exit(-1) | ||
|
||
model = sys.argv[1] | ||
|
||
# capture SIGINT signal, e.g., Ctrl+C | ||
signal.signal(signal.SIGINT, signal_handler) | ||
|
||
detector = snowboydecoder.HotwordDetector(model, sensitivity=0.5) | ||
print('Listening... Press Ctrl+C to exit') | ||
|
||
# main loop | ||
detector.start(detected_callback=snowboydecoder.play_audio_file, | ||
interrupt_check=interrupt_callback, | ||
sleep_time=0.03) | ||
|
||
detector.terminate() |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
import snowboydecoder | ||
import sys | ||
import signal | ||
|
||
# Demo code for listening for two hotwords at the same time | ||
|
||
interrupted = False | ||
|
||
|
||
def signal_handler(signal, frame): | ||
global interrupted | ||
interrupted = True | ||
|
||
|
||
def interrupt_callback(): | ||
global interrupted | ||
return interrupted | ||
|
||
if len(sys.argv) != 3: | ||
print("Error: need to specify 2 model names") | ||
print("Usage: python demo2.py 1st.model 2nd.model") | ||
sys.exit(-1) | ||
|
||
models = sys.argv[1:] | ||
|
||
# capture SIGINT signal, e.g., Ctrl+C | ||
signal.signal(signal.SIGINT, signal_handler) | ||
|
||
sensitivity = [0.5]*len(models) | ||
detector = snowboydecoder.HotwordDetector(models, sensitivity=sensitivity) | ||
callbacks = [lambda: snowboydecoder.play_audio_file(snowboydecoder.DETECT_DING), | ||
lambda: snowboydecoder.play_audio_file(snowboydecoder.DETECT_DONG)] | ||
print('Listening... Press Ctrl+C to exit') | ||
|
||
# main loop | ||
# make sure you have the same number of callbacks and models | ||
detector.start(detected_callback=callbacks, | ||
interrupt_check=interrupt_callback, | ||
sleep_time=0.03) | ||
|
||
detector.terminate() |
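
The header in this commit documents that hotword indices are 1-based and run consecutively across the model string ("foo.pmdl,bar.umdl" gives x=1, y=2, z=3), and that `RunDetection()` returns -1 for an error, 0 for no event, and the hotword index otherwise. The sketch below shows one way that bookkeeping could be done when pairing callbacks with hotwords. Both `hotword_indices` and `dispatch` are hypothetical helpers written for illustration, and the per-model hotword counts are assumed to be known by the caller; this is not necessarily how `snowboydecoder` handles it internally.

```python
def hotword_indices(hotwords_per_model):
    """Map each model name to the 1-based indices of its hotwords.

    `hotwords_per_model` is a list of (model_name, hotword_count) pairs in
    the same order as the comma-separated model string. Hypothetical helper.
    """
    mapping = {}
    next_index = 1
    for name, count in hotwords_per_model:
        mapping[name] = list(range(next_index, next_index + count))
        next_index += count
    return mapping


def dispatch(result, callbacks):
    """Run the callback for a detection result; ignore 0 (no event) and -1 (error).

    `callbacks[0]` belongs to hotword 1, `callbacks[1]` to hotword 2, and so on.
    Returns True if a callback was run.
    """
    if result <= 0:
        return False
    callbacks[result - 1]()
    return True


# The example from the header: foo.pmdl has one hotword, bar.umdl has two.
mapping = hotword_indices([("foo.pmdl", 1), ("bar.umdl", 2)])

fired = []
callbacks = [lambda: fired.append("x"),
             lambda: fired.append("y"),
             lambda: fired.append("z")]
dispatch(3, callbacks)  # appends "z" to `fired`
```

Because the lookup is positional, a callback list shorter than the number of loaded hotwords would raise an `IndexError` on the first unmatched detection, which is why the counts need to agree.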
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
PyAudio==0.2.9 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../../resources/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
// swig/snowboy-detect-swig.i | ||
|
||
// Copyright 2016 KITT.AI (author: Guoguo Chen) | ||
|
||
%module snowboydetect | ||
|
||
// Suppress SWIG warnings. | ||
#pragma SWIG nowarn=SWIGWARN_PARSE_NESTED_CLASS | ||
%include "std_string.i" | ||
|
||
%{ | ||
#include "include/snowboy-detect.h" | ||
%} | ||
|
||
%include "include/snowboy-detect.h" |
Wouldn't it make more sense to pass the values as actual floats via a variadic function? Since it's C++, you could simply overload this function, keep the old method, and deprecate it slowly.
http://en.cppreference.com/w/cpp/utility/variadic
The equivalent GetSensitivity() method could take an index as an input parameter instead of returning one long string of all sensitivities. Or maybe there are even better C++ solutions; those are the ones I know.
cc @chenguoguo
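
The suggestion above concerns the C++ signatures, but the underlying idea (numeric values in, per-index lookup out) can be sketched independently of the library. The two Python helpers below are hypothetical and not part of Snowboy; they only convert between a list of floats and the comma-separated string that `SetSensitivity()` and `GetSensitivity()` use in this commit. The 1-based index in `sensitivity_at` is an assumption chosen to match the hotword numbering in the header.

```python
def format_sensitivities(values):
    """Join floats into the comma-separated form, e.g. [0.4, 0.5] -> "0.4,0.5".

    Hypothetical helper; rejects values outside the documented 0..1 range.
    """
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError("sensitivity must be between 0 and 1: %r" % (value,))
    return ",".join(str(value) for value in values)


def sensitivity_at(sensitivity_str, index):
    """Return the sensitivity for the hotword with the given 1-based index."""
    parts = [part.strip() for part in sensitivity_str.split(",")]
    if not 1 <= index <= len(parts):
        raise IndexError("no hotword with index %d" % index)
    return float(parts[index - 1])


sensitivity_str = format_sensitivities([0.4, 0.5, 0.8])
print(sensitivity_str)                     # 0.4,0.5,0.8
print(sensitivity_at(sensitivity_str, 2))  # 0.5
```

A thin layer like this keeps the existing string-based interface untouched, so it could coexist with the current methods while callers migrate.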