How to extract text from a PDF using PDFKit

Written by Paul Hudson    @twostraws

PDFKit comes with a built-in class called PDFDocument, which allows us to load and parse PDF documents. It’s used when you want to put your PDF into a PDFView, but it’s also useful when you just want to read text from the PDF: you can loop over each page in the PDF, read its attributedString property, then append it to an attributed string containing all the text from the PDF.

Here’s some example code to do just that:

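import PDFKit

// `url` is assumed to be a URL pointing at a PDF file
if let pdf = PDFDocument(url: url) {
    let pageCount = pdf.pageCount
    let documentContent = NSMutableAttributedString()

    // PDFKit pages are zero-indexed, so start at 0 to include the first page
    for i in 0 ..< pageCount {
        guard let page = pdf.page(at: i) else { continue }
        guard let pageContent = page.attributedString else { continue }
        documentContent.append(pageContent)
    }

    // documentContent now holds the text of every page, in order
}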

It’s an attributed string, so it will retain formatting from the PDF as best as it can.
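
If you don’t need the formatting, each PDFPage also has a string property that returns plain text. Here’s a minimal sketch of that approach. The function name plainText(fromPDFAt:) is my own invention for illustration rather than part of PDFKit, and adding a line break after each page is an assumption about how you want pages separated:

```swift
import Foundation
import PDFKit

// Hypothetical helper (not a PDFKit API): returns the plain text of every
// page in the PDF at `url`, or nil if the document can't be loaded.
func plainText(fromPDFAt url: URL) -> String? {
    guard let pdf = PDFDocument(url: url) else { return nil }

    var text = ""

    // Pages are zero-indexed, so loop from 0 up to (but not including) pageCount
    for i in 0 ..< pdf.pageCount {
        // Skip any page that can't be read or has no text
        guard let page = pdf.page(at: i), let pageText = page.string else { continue }
        text += pageText
        text += "\n" // assumption: separate pages with a line break
    }

    return text
}
```

PDFDocument has its own string property too, which returns the text of the whole document in one go, so the loop is mainly useful when you want to handle pages individually.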

Available from iOS 11.0 – learn more in my book Practical iOS 11

About the Swift Knowledge Base

This is part of the Swift Knowledge Base, a free, searchable collection of solutions for common iOS questions.
