
Kotlin Speech Features

Integration

Add jitpack.io to your project's repositories:

allprojects {
  repositories {
    google()
    maven { url 'https://jitpack.io' }
  }
}

Add the dependency:

dependencies {
    implementation "com.github.MerlynMind:kotlin_speech_features:${version}"
}

Example implementation

A sample app is included in this repo to help understand the implementation.

Convert your audio signal to a float array (a demo is provided in the sample app).
Initialize speech features

```kotlin
private val speechFeatures = SpeechFeatures()
```

Perform any of the 4 operations:

```kotlin
// Each call is independent; use whichever features you need.
val mfccFeatures = speechFeatures.mfcc(MathUtils.normalize(wav), nFilt = 64)
val fbankFeatures = speechFeatures.fbank(MathUtils.normalize(wav), nFilt = 64)
val logfbankFeatures = speechFeatures.logfbank(MathUtils.normalize(wav), nFilt = 64)
val sscFeatures = speechFeatures.ssc(MathUtils.normalize(wav), nFilt = 64)
```

The result will contain matrices with the expected features. Pass these features on for further processing (e.g. classification, speech recognition).
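The first step above (getting the audio into a float array) can be sketched as follows for raw 16-bit little-endian PCM data. This is only an illustration: `pcm16ToFloatArray` is a hypothetical helper and not part of this library, and it assumes headerless mono PCM. The sample app remains the reference for how the demo actually loads audio.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper (not part of the library): converts raw 16-bit
// little-endian PCM bytes into a FloatArray of sample values.
// Values are left unscaled; normalization is a separate step.
fun pcm16ToFloatArray(pcm: ByteArray): FloatArray {
    val shorts = ByteBuffer.wrap(pcm)
        .order(ByteOrder.LITTLE_ENDIAN)
        .asShortBuffer()
    val samples = FloatArray(shorts.remaining())
    for (i in samples.indices) {
        samples[i] = shorts.get(i).toFloat()
    }
    return samples
}

fun main() {
    // Three samples: 1, -1 and -32768 encoded as little-endian 16-bit values.
    val pcm = byteArrayOf(0x01, 0x00, 0xFF.toByte(), 0xFF.toByte(), 0x00, 0x80.toByte())
    println(pcm16ToFloatArray(pcm).joinToString())
}
```

The resulting array plays the role of `wav` in the calls above. A trailing odd byte, if any, is ignored by the short-buffer view.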

--- </details>

Integration

In Xcode, go to File > Add Packages...
Paste the URL of this repo into the search box
Select the package that is found
Click the Add Package button

Example implementation

A sample app is included in this repo to help understand the implementation.

Convert your audio signal to a KotlinIntArray and normalize it.

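```swift
import KotlinSpeechFeatures

// Copy a Swift [Int] into the KotlinIntArray type exposed by the Kotlin framework.
func toKotlinIntArray(arr: [Int]) -> KotlinIntArray {
    // Use count (not capacity) so the Kotlin array matches the number of samples.
    let result = KotlinIntArray(size: Int32(arr.count))
    // A half-open range is safe for empty arrays.
    for i in 0..<arr.count {
        result.set(index: Int32(i), value: Int32(arr[i]))
    }
    return result
}

let signal: [Int] = [] // Example signal; replace with your audio samples
let normalized = MathUtils.Companion.init().normalize(sig: toKotlinIntArray(arr: signal))
```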

Initialize speech features

```swift
let speechFeatures = SpeechFeatures()
```

Perform any of the 4 operations:

```swift
// Each call is independent; use whichever features you need.
let mfccFeatures = speechFeatures.mfcc(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01,
    numCep: 13, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97,
    ceplifter: 22, appendEnergy: true, winFunc: nil)
let fbankFeatures = speechFeatures.fbank(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01,
    nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
let logfbankFeatures = speechFeatures.logfbank(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01,
    nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
let sscFeatures = speechFeatures.ssc(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01,
    nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
```

The result will contain matrices with the expected features. Pass these features on for further processing (e.g. classification, speech recognition).
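To get a feel for the size of these matrices, the sketch below estimates how many frames (rows) a signal produces for a given `sampleRate`, `winLen` and `winStep`. It is written in Kotlin, the library's own language, and follows the framing rule used by the original Python Speech Features library. Whether this port frames the signal identically is an assumption, and `frameCount` is a hypothetical helper, not part of the API.

```kotlin
import kotlin.math.ceil
import kotlin.math.roundToInt

// Hypothetical helper (not part of the library): estimates the number of
// analysis frames, assuming the framing rule of Python Speech Features.
fun frameCount(
    numSamples: Int,
    sampleRate: Int = 16000,
    winLen: Double = 0.025, // window length in seconds
    winStep: Double = 0.01  // step between window starts in seconds
): Int {
    val frameLen = (winLen * sampleRate).roundToInt()   // 400 samples by default
    val frameStep = (winStep * sampleRate).roundToInt() // 160 samples by default
    // A signal no longer than one window yields a single (padded) frame.
    if (numSamples <= frameLen) return 1
    return 1 + ceil((numSamples - frameLen).toDouble() / frameStep).toInt()
}

fun main() {
    // One second of 16 kHz audio with the default window settings.
    println(frameCount(16000))
}
```

Under these assumptions, one second of 16 kHz audio gives 99 frames, so an MFCC call with `numCep: 13` would be expected to return roughly a 99 x 13 matrix.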

</details>

<details> <summary> JavaScript </summary>

`` Coming soon... ``

</details>

✍️ Contributing

Interested in contributing to the library? Thank you so much for your interest! We are always looking for improvements to the project and contributions from open-source developers are greatly appreciated.

Clone the repo and create a new branch:

git clone https://github.com/merlynmind/kotlin_speech_features
cd kotlin_speech_features
git checkout -b name_for_new_branch

Make your changes and test them
Submit a pull request with a comprehensive description of the changes

🌟 Spread the word!

If you want to say thank you and/or support active development of this library:

Add a GitHub Star to the project!
Tweet about the project on your Twitter!

Tag @MerlynMind and/or #heyMerlyn

Thank you so much for your interest in growing the reach of our library!

🧡 Credits

Arjun Sunil - Original Author of kotlin speech features
Raquib-Ul Alam - For major refactoring and making the code presentable
Rob Smith - For Mentoring and helping us to navigate through the task

📝 References

Original library - Python Speech Features
Reference Library - C Speech Features
Sample english.wav was obtained from

wget http://voyager.jpl.nasa.gov/spacecraft/audio/english.au
sox english.au -e signed-integer english.wav

Package Metadata

Repository: emergenceai/kotlin_speech_features

Default branch: main

README: README.md