SOLVED: Voice Gender Recognition

I’m working on a vocal singing app that needs some level of gender and age recognition to improve its suggestions of songs to sing.

Does anyone know of an Apple SDK or third-party SDK (Git repo or otherwise) that can take a voice sample and attempt to predict the gender and/or age of the speaker? My preference is to do it all on-device, both to allow offline use and to protect the user's information. We have a dedicated server that could be used if on-device is not an option.

We do already have a voice interpreter for key phrases, but the gender portion is (understandably) more of a challenge for a variety of reasons.

As always, thank you in advance to the great community of folks who help each other on this platform.

Keeping in mind that I know nothing about speech recognition...

You could take a look at Apple's Speech framework, particularly the SFSpeechRecognitionResult class. Its speechRecognitionMetadata property exposes some voiceAnalytics that may be of use to you:

// Voice analytics corresponding to a segment of recorded audio
@available(iOS 13, *)
open class SFVoiceAnalytics : NSObject, NSCopying, NSSecureCoding {

    // Jitter measures vocal stability and is measured as an absolute difference between consecutive periods, divided by the average period. It is expressed as a percentage
    @NSCopying open var jitter: SFAcousticFeature { get }

    // Shimmer measures vocal stability and is measured in decibels
    @NSCopying open var shimmer: SFAcousticFeature { get }

    // Pitch measures the highness and lowness of tone and is measured in logarithm of normalized pitch estimates
    @NSCopying open var pitch: SFAcousticFeature { get }

    // Voicing measures the probability of whether a frame is voiced or not and is measured as a probability
    @NSCopying open var voicing: SFAcousticFeature { get }
}

(that comes from the header files in Xcode)
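As a rough sketch of how you might pull a single number out of those analytics: the snippet below averages the pitch feature over the frames that look voiced. The helper name `meanVoicedPitch`, the 0.5 voicing threshold, and the assumption that the pitch and voicing arrays line up frame for frame are all mine, not anything Apple documents. Note that pitch is reported as a logarithm of normalized estimates, not Hz, so any cutoff you derive from it would need to be calibrated against your own recordings.

```swift
import Foundation
#if canImport(Speech)
import Speech
#endif

/// Hypothetical helper: averages `pitch` over frames whose voicing
/// probability is at least `threshold` (0.5 is an arbitrary choice).
/// Returns nil if the arrays differ in length or no frame qualifies.
func meanVoicedPitch(pitch: [Double], voicing: [Double], threshold: Double = 0.5) -> Double? {
    guard pitch.count == voicing.count else { return nil }
    let voiced = zip(pitch, voicing).filter { $0.1 >= threshold }.map { $0.0 }
    guard !voiced.isEmpty else { return nil }
    return voiced.reduce(0, +) / Double(voiced.count)
}

#if canImport(Speech)
/// Reads the per-frame pitch and voicing values from a recognition result.
/// Returns nil when the result carries no voice analytics.
@available(iOS 14.5, macOS 11.3, *)
func meanVoicedPitch(from result: SFSpeechRecognitionResult) -> Double? {
    guard let analytics = result.speechRecognitionMetadata?.voiceAnalytics else { return nil }
    return meanVoicedPitch(pitch: analytics.pitch.acousticFeatureValuePerFrame,
                           voicing: analytics.voicing.acousticFeatureValuePerFrame)
}
#endif
```

You would call `meanVoicedPitch(from:)` on the result handed to your recognition task's handler. Pitch alone is a weak signal for gender or age, so treat this as one input among several at best.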

Another option might be to use some form of ML. Maybe you could find a data set that would work to train up a model you can then use to analyze incoming speech/song and figure out what you need from that?

I would think that voice gender recognition would be fraught with edge cases and tricky bits, so good luck!

Turns out that Apple's Core ML seems to do the trick. You use the sound classifier in Create ML: put together a collection of sample male and female recordings, plus other non-target recordings (dogs barking, crowd noise, etc.), and use Create ML to train your model.
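For anyone wondering what the on-device side can look like once the model is trained, here is a minimal sketch using the SoundAnalysis framework to run a Core ML sound classifier over a recorded file. The `VoiceObserver` class, the `dominantLabel` helper, and the idea of summing confidences across analysis windows are illustrative assumptions on my part; the label names depend entirely on how you named your training folders.

```swift
import Foundation
#if canImport(SoundAnalysis)
import SoundAnalysis
import CoreML
#endif

/// Hypothetical aggregation: sums confidence per label across all
/// analysis windows and returns the label with the largest total.
func dominantLabel(_ windows: [(label: String, confidence: Double)]) -> String? {
    var totals: [String: Double] = [:]
    for window in windows {
        totals[window.label, default: 0] += window.confidence
    }
    return totals.max { $0.value < $1.value }?.key
}

#if canImport(SoundAnalysis)
/// Collects the top classification from each analysis window.
@available(iOS 13.0, macOS 10.15, *)
final class VoiceObserver: NSObject, SNResultsObserving {
    private(set) var windows: [(label: String, confidence: Double)] = []

    func request(_ request: SNRequest, didProduce result: SNResult) {
        guard let result = result as? SNClassificationResult,
              let top = result.classifications.first else { return }
        windows.append((label: top.identifier, confidence: top.confidence))
    }
}

/// Runs the trained sound classifier over an audio file and returns
/// the dominant label, or nil if no window produced a result.
@available(iOS 13.0, macOS 10.15, *)
func classifyVoice(at url: URL, using model: MLModel) throws -> String? {
    let request = try SNClassifySoundRequest(mlModel: model)
    let analyzer = try SNAudioFileAnalyzer(url: url)
    let observer = VoiceObserver()
    try analyzer.add(request, withObserver: observer)
    analyzer.analyze() // processes the whole file before returning
    return dominantLabel(observer.windows)
}
#endif
```

The `MLModel` you pass in would come from the class Xcode generates for your Create ML model. Everything here runs on-device, which fits the offline and privacy goals in the original question.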

The machine learning stuff is pretty cool.
