
How to lemmatize text using NLTagger

Swift version: 5.6

Paul Hudson    @twostraws   

Apple’s NaturalLanguage framework can lemmatize text for us. Lemmatization is the process of converting words to the forms you would find in a dictionary – making plural nouns singular, finding the root forms of conjugated verbs, and so on – while taking into account the context in which the words are used.

To do this, first create an instance of NLTagger with its .lemma scheme enabled, assign your text to its string property, then call enumerateTags() on it to find all the root word forms. Your closure will be passed the tag (the root word) if one exists, plus the range of the original token in the string. Return true from the closure to carry on enumerating.

So, you could lemmatize a whole sentence like this:

import NaturalLanguage

let text = "This is text with plurals such as geese, people, and millennia."
let tagger = NLTagger(tagSchemes: [.lemma])
tagger.string = text

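tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lemma) { tag, range in
    // Use the lemma if there is one, otherwise fall back to the original text for this token.
    let stemForm = tag?.rawValue ?? String(text[range])
    print(stemForm, terminator: "")

    // Returning true tells the tagger to continue to the next token.
    return true
}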

Text lemmatized in this way will be lowercase, and any token without a lemma, such as punctuation, falls back to the original text, so punctuation is preserved. So, that snippet will output “this be text with plural such as goose, person, and millennium.”
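
If you only want the words themselves rather than a rebuilt sentence, one option is to pass the tagger's options parameter so that whitespace and punctuation tokens are skipped. This is a minimal sketch rather than the only way to do it: the lemmas(in:) function name is made up for this example, and the lemmas you get back depend on the language model on the device.

```swift
import NaturalLanguage

// Hypothetical helper for this example: returns one entry per word,
// skipping whitespace and punctuation tokens.
func lemmas(in text: String) -> [String] {
    let tagger = NLTagger(tagSchemes: [.lemma])
    tagger.string = text

    var result = [String]()
    let options: NLTagger.Options = [.omitWhitespace, .omitPunctuation]

    tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lemma, options: options) { tag, range in
        // Fall back to the original word when the tagger has no lemma for it.
        result.append(tag?.rawValue ?? String(text[range]))
        return true
    }

    return result
}

print(lemmas(in: "This is text with plurals such as geese, people, and millennia."))
```

Returning an array leaves it up to the caller to decide how the words are joined, counted, or compared.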

If you intend to lemmatize text frequently, consider making it an extension on String like this:

extension String {
    func lemmatized() -> String {
        let tagger = NLTagger(tagSchemes: [.lemma])
        tagger.string = self

        var result = [String]()

        tagger.enumerateTags(in: self.startIndex..<self.endIndex, unit: .word, scheme: .lemma) { tag, tokenRange in
            let stemForm = tag?.rawValue ?? String(self[tokenRange])
            result.append(stemForm)
            return true
        }

        return result.joined()
    }
}

With that in place you can now lemmatize text easily:

let text = "This is text with plurals such as geese, people, and millennia."
print(text.lemmatized())
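
The tagger normally works out the language of the text for itself, which may be less reliable for very short strings. If you already know the language, you could pass it along as a hint using setLanguage(). The sketch below assumes a hypothetical lemmatized(language:) variant of the extension above; it is an illustration, not part of the original solution.

```swift
import NaturalLanguage

extension String {
    // Hypothetical variant of lemmatized() that takes a language hint.
    func lemmatized(language: NLLanguage) -> String {
        // Nothing to tag in an empty string.
        guard !isEmpty else { return "" }

        let fullRange = startIndex..<endIndex
        let tagger = NLTagger(tagSchemes: [.lemma])
        tagger.string = self

        // Tell the tagger which language to assume rather than letting it guess.
        tagger.setLanguage(language, range: fullRange)

        var result = ""

        tagger.enumerateTags(in: fullRange, unit: .word, scheme: .lemma) { tag, tokenRange in
            // Keep the original text when there is no lemma, such as for spaces and punctuation.
            result += tag?.rawValue ?? String(self[tokenRange])
            return true
        }

        return result
    }
}

print("geese".lemmatized(language: .english))
```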

Available from iOS 12.0
