
SOLVED: Is there a regex for this?


Not sure how to approach this problem, but I think it's perfect for a regex. Sadly, I don't know enough regex to make it work...

I have a block of text and I want to highlight parts of it. In my case, highlighting means surrounding the found text with HTML span tags. The search needs to be case insensitive and I need to retain the original case in the output. Best illustrated with an example:

Input: "This is a sentence with Mars in it which is a remarkable planet". Search text: "mar". Expected output: "This is a sentence with #span#Mar#/span#s in it which is a re#span#mar#/span#kable planet".

(I've used #span# because I can't use the actual tag in a post).

Is this possible?

My ultimate aim is to be able to highlight search text in an HTML document that I am generating from some Markdown. The conversion from Markdown to HTML is fine; the only stumbling block left is this highlighting. I tried using string.replacingOccurrences, but that doesn't give me access to the 'original' text, so the results do not reflect the casing of the input.

Push comes to shove, I'll go back to replacingOccurrences and make the output all uppercase. It's not ideal though.
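For comparison, here is a minimal non-regex sketch of the behaviour described above, assuming Foundation's case-insensitive `range(of:options:range:)` search is acceptable. The name `wrapOccurrences` and its `open`/`close` parameters are made up for illustration; they are not from any library.

```swift
import Foundation

// Hypothetical helper: wraps every case-insensitive occurrence of `search`
// in `open`/`close` markers while copying the matched text unchanged,
// so the original casing survives.
func wrapOccurrences(of search: String,
                     in text: String,
                     open: String = "<span>",
                     close: String = "</span>") -> String {
    guard !search.isEmpty else { return text }

    var result = ""
    var cursor = text.startIndex

    // Walk the string, searching only the part that has not been copied yet.
    while let found = text.range(of: search,
                                 options: .caseInsensitive,
                                 range: cursor..<text.endIndex) {
        result += String(text[cursor..<found.lowerBound]) // text before the match
        result += open + String(text[found]) + close      // the match, original case
        cursor = found.upperBound
    }

    result += String(text[cursor...]) // whatever follows the last match
    return result
}

let input = "This is a sentence with Mars in it which is a remarkable planet"
print(wrapOccurrences(of: "mar", in: input))
```

Because the matched range is read back from the original string, nothing has to be upper-cased or reconstructed afterwards.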

Thanks for any suggestions.

   

Close, but no cigar!

        if !vm.highlightText.isEmpty {

            let search = "(?'search'\(vm.highlightText))"
            if let regex = try? Regex(search) {
                noteText = noteText.replacing(regex.ignoresCase(),
                                              with: "## $search ##")
            }
            print(noteText)
        }

This correctly identifies the text I am searching for and puts it into a named capture. Sadly, the replacing doesn't use the named capture.
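One possible way around this, sketched under a couple of assumptions: the string passed to `replacing(_:with:)` is inserted literally, but the closure-based overload of `replacing` receives each match, so the replacement can be built from the text that was actually found. The function name `markMatches` is hypothetical, and `Regex(verbatim:)` is used here (my choice, not from the post) so the search text is treated as literal text rather than as a pattern.

```swift
import Foundation

// Hypothetical helper: surrounds each case-insensitive match with "## ... ##",
// reusing the matched text so the original casing is kept.
// Requires the Swift 5.7+ Regex APIs (iOS 16 / macOS 13 or later).
func markMatches(of search: String, in text: String) -> String {
    guard !search.isEmpty else { return text }

    // Match the search text literally, ignoring case.
    let regex = Regex<Substring>(verbatim: search).ignoresCase()

    // The closure is called once per match; `match.output` is the matched Substring.
    return text.replacing(regex) { match in
        "## \(match.output) ##"
    }
}

print(markMatches(of: "mar", in: "Mars is remarkable"))
```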

   

I reverted to NSRegularExpression, but that was a bust because it doesn't handle named groups. So, I've ended up with a bit of a hack:

    private func createHilight(text: String, highlight: String) -> String {
        let regexText = "(\(highlight))"
        let replaceText = "<span class='highlight'>$search</span>"

        guard let regex = try? NSRegularExpression(pattern: regexText,
                                                   options: .caseInsensitive) else {
            // failed to create regex, return the original string
            return text
        }

        // Get the matches, so we have access to the original text
        // NSRange counts UTF-16 code units, so build it from the string's indices
        let textRange = NSRange(text.startIndex..<text.endIndex, in: text)
        let resultValues = regex.matches(in: text, range: textRange)

        // Do the replacement
        var resultText = regex.stringByReplacingMatches(
            in: text,
            options: .withTransparentBounds,
            range: textRange,
            withTemplate: replaceText
        )

        // Now put the original text back
        for match in resultValues {
            if let textRange = Range(match.range, in: text) {
                let originalValue = text[textRange]
                resultText = resultText.replacing("$search", with: originalValue, maxReplacements: 1)
            }
        }

        return resultText

    }

I run the regex to get the matches, which returns the original matching text it found. After that, I run the regex to do the replacement, which gives me the target string but with a placeholder for the original text. I then loop through the matches we found in the original string and plug them into the result of the regex replacement.

It's a massive hack, but my options seem to be severely limited. I hate this code for so many reasons, but it works and lets me move on to the next part of the app until I can properly learn regex and better understand what I can do in Swift.
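One detail worth checking in code like this is how the NSRange is built. NSRange lengths are measured in UTF-16 code units, while `String.count` counts Characters, so the two can disagree as soon as the text contains emoji or other characters outside the Basic Multilingual Plane. A small sketch with a made-up sample string:

```swift
import Foundation

let sample = "🚀 Mars"

let characterCount = sample.count       // 6 Characters
let utf16Count = sample.utf16.count     // 7 UTF-16 code units (the rocket takes two)

// Building the range from String indices always covers the whole string.
let fullRange = NSRange(sample.startIndex..<sample.endIndex, in: sample)

print(characterCount, utf16Count, fullRange.length)
```

A range built with `length: text.count` would stop one code unit short here, so a match at the very end of the text could be missed.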

   


Do you have to use named captures? Seems like for such a simple use case, they aren't really necessary and a simple positional capture would work just fine.

private func createHighlight(in text: String, highlighting: String) -> String {
    guard let regex = try? NSRegularExpression(
        pattern: "\\b(\(highlighting))\\b",
        options: .caseInsensitive
    ) else {
        return text
    }
    let range = NSRange(text.startIndex..<text.endIndex,
                        in: text)
    let result = regex.stringByReplacingMatches(in: text,
                                                options: [],
                                                range: range,
                                                withTemplate: #"<span class="highlight">$1</span>"#)
    return result
}

let sampleText = """
Aliqua lorem cillum commodo sit pariatur adipiscing nulla fugiat excepteur velit non proident ipsum ullamco cupidatat proident culpa qui incididunt proident nostrud consequat ea dolor id sit quis enim dolor esse in id exercitation excepteur deserunt ipsum laborum est id dolore irure consequat sed culpa laboris ex aliquip incididunt esse
"""

print(createHighlight(in: sampleText, highlighting: "esse"))

(Note: I changed the function name a bit because your original function was named createHilight but everywhere else you use highlight, and I found the two different spellings confusing and aesthetically undesirable.)

Anyway, the above code results in this output:

Aliqua lorem cillum commodo sit pariatur adipiscing nulla fugiat excepteur velit non proident ipsum ullamco cupidatat proident culpa qui incididunt proident nostrud consequat ea dolor id sit quis enim dolor <span class="highlight">esse</span> in id exercitation excepteur deserunt ipsum laborum est id dolore irure consequat sed culpa laboris ex aliquip incididunt <span class="highlight">esse</span>

Isn't this the kind of result you're looking for?
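One thing to watch with the pattern above: the `\b` anchors restrict matches to whole words, which suits "esse" in the sample but would skip the "Mar" in "Mars" from the original question. A small sketch of the difference, using a hypothetical `matchCount` helper that is only for illustration:

```swift
import Foundation

// Hypothetical helper: counts case-insensitive matches of `pattern` in `text`.
func matchCount(pattern: String, in text: String) -> Int {
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: .caseInsensitive) else {
        return 0
    }
    let range = NSRange(text.startIndex..<text.endIndex, in: text)
    return regex.numberOfMatches(in: text, options: [], range: range)
}

let sentence = "This is a sentence with Mars in it which is a remarkable planet"

// With word boundaries, "mar" must be a whole word, so nothing matches.
let wholeWordMatches = matchCount(pattern: #"\b(mar)\b"#, in: sentence)

// Without them, "Mar" in "Mars" and "mar" in "remarkable" both match.
let substringMatches = matchCount(pattern: "(mar)", in: sentence)

print(wholeWordMatches, substringMatches)
```

Dropping the `\b` anchors gives the substring behaviour asked for in the first post.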

   

You can use <span> in a code block like this:

// Put your html code inside a code block.
<span>Barnett</span>

   

@roosterboy I tried many variations of the code you have offered. As you say, I don't need the named groups. What I did miss, being somewhat new to regex, was putting the # symbols around the withTemplate parameter, so my template became just a string and not a replacement regex.

Doesn't matter how many times I looked at it, I just did not spot that one.

Many thanks. I can stop fiddling with this code now...

p.s. I already renamed the func in my real code. Like you, I want code to be meaningful and consistent.

   

Nota bene

This example from GitHub adds an extension to String that applies **bold markup** to a string, using Regex to find and replace.

This looks like what you're trying to accomplish? Also, side benefit, you have a useful extension to String.

Keep Coding

See-> Regex and Markdown Extension

   

@roosterboy I tried many variations of the code you have offered. As you say, I don't need the named groups. What I did miss, being somewhat new to regex, was putting the # symbols around the withTemplate parameter, so my template became just a string and not a replacement regex.

Those # around the string are what's known as Extended String Delimiters.

I used them here to avoid having to escape the " inside the span tag. Without the extended string delimiters, I would have had to write it as "<span class=\"highlight\">$1</span>"
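A quick sketch of what the delimiters do and don't change. The `$1` placeholder is interpreted by NSRegularExpression's template handling either way; the `#` marks only change how the Swift compiler reads quotes, backslashes and interpolation inside the literal. The variable names are just for illustration.

```swift
// The same template written with and without extended string delimiters.
let rawTemplate = #"<span class="highlight">$1</span>"#
let escapedTemplate = "<span class=\"highlight\">$1</span>"
print(rawTemplate == escapedTemplate) // the two literals produce identical strings

// Inside #"..."#, a backslash is an ordinary character.
let rawBackslash = #"\n"#   // two characters: a backslash and an "n"
let newline = "\n"          // one character: a line feed

// Interpolation still works, but needs the extra # as well.
let cssClass = "highlight"
let interpolated = #"<span class="\#(cssClass)">$1</span>"#
```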

   

So, I ended up with this: a minor variation on the code you posted, formatted to (a) better fit my screen size and (b) stop SwiftLint moaning about line lengths. I will probably compress it again when this layout gets irritating.

    fileprivate func createHighlight(text: String, highlight: String) -> String {
        let regexText = "(\(highlight))"
        let replacement = #"<span class='highlight'>$1</span>"#

        guard let regex = try? NSRegularExpression(
            pattern: regexText,
            options: .caseInsensitive
        )
        else {
            // Regex failed, so return the original string unchanged
            return text
        }

        let range = NSRange(
            text.startIndex..<text.endIndex,
            in: text
        )

        let result = regex.stringByReplacingMatches(
            in: text,
            options: [],
            range: range,
            withTemplate: replacement
        )

        return result
    }

I'm taking the user entered markdown text, running it through this highlight function to insert span statements around the highlight and then running the markdown through MarkdownKit to convert it to HTML for my preview. So, when I do a find, I end up with properly highlighted text in the tree (using AttributedString) and in the note preview using HTML:

[Screenshot: window with filter]
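Since the highlight text comes from the user, it may be worth treating it as literal text rather than as a pattern; otherwise a search for something like "c++" fails to compile as a regex and "a.c" also matches "abc". A hedged sketch of one way to do that with `NSRegularExpression.escapedPattern(for:)`. The function name `highlightLiteral` is made up so it doesn't clash with the function above.

```swift
import Foundation

// Hypothetical variant: escapes the user's text before building the pattern,
// so regex metacharacters in the search are matched literally.
func highlightLiteral(_ highlight: String, in text: String) -> String {
    guard !highlight.isEmpty else { return text }

    let pattern = "(" + NSRegularExpression.escapedPattern(for: highlight) + ")"
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: .caseInsensitive) else {
        // Regex failed, so return the original string unchanged
        return text
    }

    let range = NSRange(text.startIndex..<text.endIndex, in: text)
    return regex.stringByReplacingMatches(
        in: text,
        options: [],
        range: range,
        withTemplate: "<span class='highlight'>$1</span>"
    )
}

print(highlightLiteral("c++", in: "I like C++"))
print(highlightLiteral("a.c", in: "abc a.c"))
```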

I added this functionality because I thought my find wasn't working. I was finding a string and getting results I didn't expect. By highlighting the results, I can see what is being found and can exclude those that are irrelevant.

Thanks, both, for your support.

   

Update

I've been hacking with the new, improved regular expression builder provided in updates for iOS 16+. This is a departure from the ancient hieroglyphs ^𐦆𐦝𐦟𐦉𐦃[a-z]* popular in times BC (before covid).

In this code snippet, I used the newer expression builder to create a Regex pattern that searches for one occurrence of your user's input text, ignoring case. Then I tacked on a transformation closure that changes the found text by wrapping it in your HTML span tags. The cool thing about the closure is that if you need to apply additional rules, it is a sweet place for them. The original text and the transformation are the contents of the Regex's output structure. You can reference the actual match using the .0 suffix. You access the transformed output using the .1 suffix.

Then I use this pattern in a String .replacing() method that accepts a Regex for the search. I supply a terse closure that contains the found string's replacement.

Updated using Regex Builder syntax

import Foundation
import RegexBuilder  // <-- Required for Regex Builder functions

var originalText  = "Marla is in the cold marina's market with old German Marks for AmArGolD Corp."

class highlight {
    var searchText:      String
    var originalText:    String
    var highlightedText: String {
        // This creates a searchPattern, with a trick.        
        // If this searchPattern can capture some text,
        // it then transforms the found text by wrapping it in an html span. Cool! 😎
        let searchPattern = TryCapture { One(searchText).regex.ignoresCase()    }
                            transform: { "<span class='highlight'>\($0)</span>" }

        // This replaces any matches of the regex searchPattern with the transformed text.
        return originalText.replacing(searchPattern) { $0.1 }
    }

    required init(_ thisText: String, inText: String) {
        searchText   = thisText
        originalText = inText
    }
}

Test Cases

Paste the code above into Playgrounds. Then use this code to test.

// Test Case == Highlight OLD ===============
var setup = highlight("old", inText: originalText)
print(setup.originalText + "\n")
print(setup.highlightedText)
// Test Case == Highlight MAR ===============
setup.searchText = "mar"
print(setup.highlightedText)

Test Output

// Marla is in the cold marina's market with old German Marks for AmArGolD Corp.             

// Marla is in the c<span class='highlight'>old</span> marina's market with <span class='highlight'>old</span> ....snip....        
// <span class='highlight'>Mar</span>la is in the cold <span class='highlight'>mar</span>ina's <span class='highlight'>mar</span>ket .....snip.....   
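To see the two parts of the output separately, here is a small sketch that looks at the first match only. It is a variation, not the code above: it uses `Capture` instead of `TryCapture` (the transform here cannot fail) and `Regex(verbatim:)` so the search text is taken literally. Both are my assumptions, and `firstHighlight` is a made-up name.

```swift
import RegexBuilder

// Hypothetical helper: returns the matched text (.0) and its transformed
// form (.1) for the first case-insensitive occurrence of `search`.
func firstHighlight(of search: String,
                    in text: String) -> (original: Substring, wrapped: String)? {
    guard !search.isEmpty else { return nil }

    let pattern = Capture {
        Regex<Substring>(verbatim: search).ignoresCase()
    } transform: { found in
        "<span class='highlight'>\(found)</span>"
    }

    guard let match = text.firstMatch(of: pattern) else { return nil }

    // match.0 is the text as it appears in the input, match.1 is the transform's result.
    return (match.0, match.1)
}

if let result = firstHighlight(of: "mar", in: "Marla is in the cold marina") {
    print(result.original)
    print(result.wrapped)
}
```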

Keep Coding!

   

Excellent, thank you for this. I've managed to avoid regex for many years because of the cryptic way it's coded, so RegexBuilder looks like a definite step forward. It makes for considerably more readable code which is, after all, a very desirable outcome. Also, having the option to modify the substitution seems to offer endless possibilities, though not in this simple case.

I converted a C# markdown formatter a while ago that used string manipulation to convert markdown to HTML. While it worked and the transform was relatively quick, it did suffer occasional slowdowns in the tests. It really needed to be regex, but regex was impenetrable at the time and the only regex-based open source I could find was in PHP. Having converted the code to Swift, very few of the regexes worked, which is where I discovered that it was going to take months to convert and test and I couldn't rely on the eventual results. So I gave up and started something more useful.

Seems like it might be time to revisit that decision and start a fresh conversion. I've always felt a little 'inadequate' for not knowing regex.

I know there are packages out there that do this already, but where's the fun in that? I also prefer my projects to not be dependent on someone else's package.

   
