emergenceai/kotlin_speech_features
Kotlin Speech Features
Integration
Add jitpack.io to your project's repositories:

    allprojects {
        repositories {
            google()
            maven { url 'https://jitpack.io' }
        }
    }

Add the dependency:

    dependencies {
        implementation "com.github.MerlynMind:kotlin_speech_features:${version}"
    }

Example implementation
A sample app is included in this repo to help understand the implementation.
- Convert your audio signal to a float array (a demo is provided in the sample app).
- Initialize speech features
```kotlin
private val speechFeatures = SpeechFeatures()
```
- Perform any of the 4 operations:
```kotlin
// Each call is independent; use whichever feature type you need.
val mfccFeatures = speechFeatures.mfcc(MathUtils.normalize(wav), nFilt = 64)
val fbankFeatures = speechFeatures.fbank(MathUtils.normalize(wav), nFilt = 64)
val logFbankFeatures = speechFeatures.logfbank(MathUtils.normalize(wav), nFilt = 64)
val sscFeatures = speechFeatures.ssc(MathUtils.normalize(wav), nFilt = 64)
```
- The result will contain matrices with the expected features. Pass these features on for further processing (e.g. classification, speech recognition).
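
The first step above leaves the conversion itself to the sample app. As a minimal sketch, assuming the audio is available as headerless, little-endian 16-bit PCM bytes, the helper below turns them into a float array of raw sample values. `pcm16ToFloatArray` is a hypothetical name used for illustration and is not part of this library. Whether `MathUtils.normalize` accepts this exact array type is not confirmed here, so check the sample app before relying on it.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper (not part of kotlin_speech_features).
// Assumes `pcm` holds raw little-endian 16-bit PCM samples with no WAV header.
fun pcm16ToFloatArray(pcm: ByteArray): FloatArray {
    // View the bytes as 16-bit samples; a trailing odd byte is ignored.
    val shorts = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer()
    // Copy each sample into a float without rescaling it.
    return FloatArray(shorts.remaining()) { i -> shorts.get(i).toFloat() }
}

fun main() {
    // Two samples encoded by hand: 1 and -2.
    val pcm = byteArrayOf(1, 0, -2, -1)
    println(pcm16ToFloatArray(pcm).joinToString())
}
```

If the audio comes from a WAV file, the header would have to be skipped before calling a helper like this one.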
--- </details>
<details> <summary> iOS </summary>
Integration
- In Xcode, go to `File > Add Packages...`
- Paste the URL of this repo into the search box
- Select the package found
- Click the `Add Package` button
Example implementation
A sample app is included in this repo to help understand the implementation.
- Convert your audio signal to a `KotlinIntArray` and normalize it.
```swift
import KotlinSpeechFeatures

// Copies a Swift [Int] into a KotlinIntArray, element by element.
func toKotlinIntArray(arr: [Int]) -> KotlinIntArray {
    let result = KotlinIntArray(size: Int32(arr.count))
    for i in 0..<arr.count {
        result.set(index: Int32(i), value: Int32(arr[i]))
    }
    return result
}

let signal: [Int] = [] // Example signal; replace with your audio samples
let normalized = MathUtils.Companion.init().normalize(sig: toKotlinIntArray(arr: signal))
```
- Initialize speech features
```swift
let speechFeatures = SpeechFeatures()
```
- Perform any of the 4 operations:
```swift
// Each call is independent; use whichever feature type you need.
let mfccFeatures = speechFeatures.mfcc(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, numCep: 13, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, ceplifter: 22, appendEnergy: true, winFunc: nil)
let fbankFeatures = speechFeatures.fbank(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
let logFbankFeatures = speechFeatures.logfbank(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
let sscFeatures = speechFeatures.ssc(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
```
- The result will contain matrices with the expected features. Pass these features on for further processing (e.g. classification, speech recognition).
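
The `sampleRate`, `winLen` and `winStep` values in the calls above determine how many rows the result has. The sketch below, written in Kotlin because that is the library's own language, estimates the frame count under the assumption that this port follows the framing rule of Python Speech Features (one frame if the signal is shorter than a window, otherwise `1 + ceil((samples - frameLen) / frameStep)`). `frameCount` is a hypothetical helper for illustration, not a function of this library, and the actual output shape should be confirmed against the sample app.

```kotlin
import kotlin.math.ceil
import kotlin.math.roundToInt

// Hypothetical helper (not part of kotlin_speech_features).
// Assumes the framing rule used by Python Speech Features.
fun frameCount(numSamples: Int, sampleRate: Int, winLen: Double, winStep: Double): Int {
    val frameLen = (winLen * sampleRate).roundToInt()   // window length in samples
    val frameStep = (winStep * sampleRate).roundToInt() // hop between windows in samples
    if (numSamples <= frameLen) return 1
    return 1 + ceil((numSamples - frameLen).toDouble() / frameStep).toInt()
}

fun main() {
    // One second of audio at 16 kHz with a 25 ms window and a 10 ms step.
    val frames = frameCount(numSamples = 16000, sampleRate = 16000, winLen = 0.025, winStep = 0.01)
    println("frames = $frames")
}
```

With these values the window is 400 samples and the step is 160 samples, so one second of 16 kHz audio gives 99 frames under this assumption. The window also fits within `nfft: 512`.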
</details>
<details> <summary> JavaScript </summary>
```
Coming soon...
```
</details>
✍️ Contributing
Interested in contributing to the library? Thank you so much for your interest! We are always looking for improvements to the project, and contributions from open-source developers are greatly appreciated.
- Clone the repo and create a new branch: `git clone https://github.com/merlynmind/kotlin_speech_features && cd kotlin_speech_features && git checkout -b name_for_new_branch`
- Make changes and test them
- Submit a pull request with a comprehensive description of the changes
🌟 Spread the word!
If you want to say thank you and/or support active development of this library:
- Add a GitHub Star to the project!
- Tweet about the project on your Twitter!
Tag @MerlynMind and/or #heyMerlnyn
Thank you so much for your interest in growing the reach of our library!
🧡 Credits
- Arjun Sunil - Original author of Kotlin Speech Features
- Raquib-Ul Alam - For major refactoring and making the code presentable
- Rob Smith - For mentoring and helping us navigate the task
📝 References
- Original library - Python Speech Features
- Reference Library - C Speech Features
- Sample english.wav was obtained from:

      wget http://voyager.jpl.nasa.gov/spacecraft/audio/english.au
      sox english.au -e signed-integer english.wav

Package Metadata
Repository: emergenceai/kotlin_speech_features
Default branch: main
README: README.md