Ksoup is an elegant Kotlin DSL wrapper for JSoup, designed to make web scraping and HTML parsing in Kotlin more intuitive, type-safe, and maintainable. Perfect for both simple extractions and complex scraping workflows.
- Type-safe Kotlin DSL for HTML parsing
- Seamless JSoup integration with enhanced Kotlin syntax
- Reactive-style data extraction
- Custom HTTP client support
- Clean, maintainable scraping code
- Lightweight with zero runtime dependencies (except JSoup)
import io.mikael.ksoup.KSoup
data class GitHubProfile(var username: String = "", var fullName: String = "")
val profile = KSoup.extract<GitHubProfile> {
result { GitHubProfile() }
url = "https://github.com/mikaelhg"
userAgent = "Mozilla/5.0 Ksoup/1.0"
headers["Accept-Encoding"] = "gzip"
text(".p-name") { text, page ->
page.fullName = text
}
element(".p-nickname") { el, page ->
page.username = el.text()
}
}
class CustomHttpClient : HttpClient {
/* Your implementation here */
}
val profile = KSoup.extract<GitHubProfile> {
httpClient = CustomHttpClient()
url = "https://github.com/mikaelhg"
// Extraction logic...
}
- Basic HTML extraction
- Custom HTTP client support
- Multi-page extraction
- "Next page" iteration
- Enhanced error handling (4xx/5xx responses)
- Async support
- Rate limiting utilities
Ksoup is released under the Apache 2.0 License.