Handle new Kindle Notes page (#16)
Some time ago (not sure exactly when), Amazon updated their Kindle
"Notes and Highlights" page. They changed the page's URL and completely
changed the page's structure.

Previously, it was possible to figure out which URL was serving up
the JSON which drove the highlights, and after logging in, simply fetch
highlights via that URL. As best I can tell, that has also changed. Now,
the URL which serves up the highlights data returns HTML.

As a result, I made some significant changes.

In the interest of [overcoming my obsession with stringly-typed
Ruby](http://confreaks.tv/videos/rubyconf2014-overcoming-our-obsession-with-stringly-typed-ruby),
I've introduced `Book` and `Highlight` classes. Highlights are fetched
through the `Book` class, via the Mechanize agent.

Additionally, highlights are losing the following information:

```
"customerId"
"embeddedId"
"endLocation"
"howLongAgo"
"startLocation"
"timestamp"
```

Unfortunately, I could not yet work out how to get that information
from the "new" endpoint. I welcome anyone taking a shot at getting that
information back into the `Highlight` class.
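
For callers upgrading from 1.x, the change from hashes to objects might look roughly like the sketch below. `HighlightSketch` and `dropped_keys_in` are hypothetical names used only for illustration. The real `Highlight` class is not shown here, so the `asin`, `text` and `location` readers are an assumption based on the README output.

```ruby
# Hypothetical stand-in for KindleHighlights::Highlight. The real class is
# not shown in this commit view, so these readers are an assumption based
# on the README output (asin, text, location).
class HighlightSketch
  attr_reader :asin, :text, :location

  def initialize(asin:, text:, location:)
    @asin = asin
    @text = text
    @location = location
  end
end

# Keys from the 1.x highlight hashes that 2.0 no longer provides.
DROPPED_KEYS = %w[
  customerId embeddedId endLocation howLongAgo startLocation timestamp
].freeze

# Returns the keys of a 1.x-style hash that have no 2.0 equivalent.
def dropped_keys_in(old_hash)
  old_hash.keys & DROPPED_KEYS
end

# A 1.x-style highlight, abbreviated from the old README example.
old_highlight = {
  "asin"          => "B005CQ2ZE6",
  "highlight"     => "One of the most dangerous things you can believe in this world is that technology is neutral.",
  "startLocation" => 29496,
  "timestamp"     => 1320901233000
}

# 1.x: string keys into a Hash. 2.0: method calls on an object.
new_highlight = HighlightSketch.new(
  asin: old_highlight["asin"],
  text: old_highlight["highlight"],
  location: "197"
)

puts new_highlight.text
puts dropped_keys_in(old_highlight).inspect
```

Code that reads any of the dropped keys will need another source for that data, or will have to do without it, until someone restores it.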

Since these are breaking changes, I will bump the version to 2.0 and cut
a new release sometime soon.
speric authored Nov 17, 2017
1 parent bbbb6c8 commit 5f55acf
Showing 11 changed files with 298 additions and 356 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -2,3 +2,5 @@
.gemspec
config/*.yml
.DS_Store
+application.yml
+Gemfile.lock
47 changes: 0 additions & 47 deletions Gemfile.lock

This file was deleted.

79 changes: 37 additions & 42 deletions README.md
@@ -9,10 +9,6 @@ A Ruby gem for collecting your Kindle highlights.
* Ruby `2.1.0` or greater
* An Amazon Kindle account

-<b>Note:</b> Version `0.0.8` of `kindle-highlights` is the last version which is compatible with older
-versions of Ruby. For documentation on how to use that
-version, see [the release](https://github.com/speric/kindle-highlights/releases/tag/v0.0.8).
-
### Install
```
gem install kindle-highlights
@@ -26,65 +22,64 @@ to sign into your Amazon Kindle account:
```ruby
require 'kindle_highlights'

-kindle = KindleHighlights::Client.new(email_address: "email.address@gmail.com", password: "password")
+kindle = KindleHighlights::Client.new(
+  email_address: "email.address@gmail.com",
+  password: "password"
+)
```

### Fetching a list of your Kindle books

Use the `books` method to get a listing of all your Kindle books. This method
-returns a hash, keyed on the ASIN, with the title as the value:
+returns a collection of `KindleHighlights::Book` objects:

```ruby
kindle.books
#=>
-{
-  "B002JCSCO8" => "The Art of the Commonplace: The Agrarian Essays of Wendell Berry",
-  "B0049SPHC0" => "Calvinistic Concept of Culture, The",
-  "B003HNOB34" => "The Collected Works of William Butler Yeats (Unexpurgated Edition) (Halcyon Classics)",
-  "B000JMKZX6" => "The Essays of Arthur Schopenhauer; On Human Nature",
-  "B005CQ2ZE6" => "From the Garden to the City",
-  "B0082ZJFCO" => "The Golden Sayings of Epictetus",
-  "B000SEGEKI" => "The Pragmatic Programmer: From Journeyman to Master",
-  "B009D6AGOM" => "The Rare Jewel of Christian Contentment",
-  "B00E25KVLW" => "Ruby on Rails 4.0 Guide",
-  "B004X5RLBY" => "The Seven Lamps of Architecture",
-  "B0032UWX1O" => "The Westminster Confession of Faith",
-  "B0026772N8" => "Zen and the Art of Motorcycle Maintenance"
-}
+[
+  <KindleHighlights::Book:
+    @asin="B000XUAETY",
+    @author="James R. Mcdonough",
+    @title="Platoon Leader: A Memoir of Command in Combat"
+  >,
+  <KindleHighlights::Book:
+    @asin="B003XDUCEU",
+    @author="Michael Lopp",
+    @title="Being Geek: The Software Developer's Career Handbook"
+  >,
+  <KindleHighlights::Book:
+    @asin="B00JJ1RIO2",
+    @author="James K. A. Smith",
+    @title="How (Not) to Be Secular: Reading Charles Taylor"
+  >
+]
```

### Fetching all highlights for a single book

To get only the highlights for a specific book, use the `highlights_for` method, passing
-in the book's Amazon ASIN as the only method parameter:
+in the book's Amazon ASIN as the only method parameter. This method returns a collection of
+`KindleHighlights::Highlight` objects:

```ruby
kindle.highlights_for("B005CQ2ZE6")
#=>
[
-  {
-    "asin" => "B005CQ2ZE6",
-    "customerId" => "...",
-    "embeddedId" => "From_the_Garden_to_the_City:420E805A",
-    "endLocation" => 29591,
-    "highlight" => "One of the most dangerous things you can believe in this world is that technology is neutral.",
-    "howLongAgo" => "1 year ago",
-    "startLocation" => 29496,
-    "timestamp" => 1320901233000
-  },
-  {
-    "asin" => "B005CQ2ZE6",
-    "customerId" => "...",
-    "embeddedId" => "From_the_Garden_to_the_City:420E805A",
-    "endLocation" => 54220,
-    "highlight" => "While God's words are eternal and unchanging, the tools we use to access those words do change, and those changes in technology also bring subtle changes to the practice of worship. When we fail to recognize the impact of such technological change, we run the risk of allowing our tools to dictate our methods. Technology should not dictate our values or our methods. Rather, we must use technology out of our convictions and values.",
-    "howLongAgo" => "1 year ago",
-    "startLocation" => 53780,
-    "timestamp" => 1321038422000
-  }
+  <KindleHighlights::Highlight:0x007fc4e7e03ea0
+    @asin="B005CQ2ZE6",
+    @text="One of the most dangerous things you can believe in this world is that technology is neutral.",
+    @location="197"
+  >
]
```

+Additionally, each book has its own `highlights_from_amazon` method:
+
+```
+book = kindle.books.first
+book.highlights_from_amazon
+```

### Advanced Usage

This gem uses [mechanize](https://github.com/sparklemotion/mechanize) to interact with Amazon's Kindle pages. You can override any of the default mechanize settings (see `lib/kindle_highlights/client.rb`) by passing your settings to the initializer:
@@ -138,4 +133,4 @@ kindle = KindleHighlights::Client.new(

### Copyright

-Copyright (c) 2011-2016 Eric Farkas. See MIT-LICENSE for details.
+Copyright (c) 2011-2018 Eric Farkas. See MIT-LICENSE for details.
8 changes: 5 additions & 3 deletions kindle_highlights.gemspec
@@ -1,18 +1,20 @@
 Gem::Specification.new do |s|
   s.name = "kindle-highlights"
-  s.version = "1.0.2"
+  s.version = "2.0.0"
   s.summary = "Kindle highlights"
   s.description = "Until there is a Kindle API, this will suffice."
   s.authors = ["Eric Farkas"]
   s.email = "eric@prudentiadigital.com"
-  s.files = ["lib/kindle_highlights.rb", "lib/kindle_highlights/client.rb"]
+  s.files = `git ls-files -- lib/*`.split("\n")
+  s.files += ["MIT-LICENSE"]
   s.homepage = "https://github.com/speric/kindle-highlights"
   s.license = "MIT"

   s.required_ruby_version = ">= 2.1.0"

-  s.add_runtime_dependency "mechanize", ">= 2.7.2"
+  s.add_runtime_dependency "mechanize", ">= 2.7.5"
   s.add_development_dependency "rake"
   s.add_development_dependency "bundler", "~> 1.3"
   s.add_development_dependency "minitest", "~> 5.0"
+  s.add_development_dependency "activesupport"
 end
12 changes: 5 additions & 7 deletions lib/kindle_highlights.rb
@@ -1,10 +1,8 @@
 require 'rubygems'
 require 'mechanize'
-require 'json'
-require 'kindle_highlights/client'
+require 'active_support/core_ext/object/blank'
+require 'active_support/core_ext/string/filters'

-module KindleHighlights
-  KINDLE_LOGIN_PAGE = "http://kindle.amazon.com/login"
-  SIGNIN_FORM_IDENTIFIER = "signIn"
-  BATCH_SIZE = 200
-end
+require_relative './kindle_highlights/client'
+require_relative './kindle_highlights/book'
+require_relative './kindle_highlights/highlight'
56 changes: 56 additions & 0 deletions lib/kindle_highlights/book.rb
@@ -0,0 +1,56 @@
+module KindleHighlights
+  class Book
+    attr_accessor :asin, :author, :title
+
+    def self.from_html_elements(html_element:, mechanize_agent:)
+      new(
+        mechanize_agent: mechanize_agent,
+        asin: html_element.attributes["id"].value.squish,
+        title: html_element.children.search("h2").first.text.squish,
+        author: html_element.children.search("p").first.text.split(":").last.strip.squish
+      )
+    end
+
+    def initialize(asin:, author:, title:, mechanize_agent: nil)
+      @asin = asin
+      @author = author
+      @title = title
+      @mechanize_agent = mechanize_agent
+    end
+
+    def to_s
+      "#{title} by #{author}"
+    end
+
+    def inspect
+      "<#{self.class}: #{inspectable_vars}>"
+    end
+
+    def highlights_from_amazon
+      return [] unless mechanize_agent.present?
+
+      @highlights ||= fetch_highlights_from_amazon
+    end
+
+    private
+
+    attr_reader :mechanize_agent
+
+    def fetch_highlights_from_amazon
+      mechanize_agent
+        .get("https://read.amazon.com/kp/notebook?captcha_verified=1&asin=#{asin}&contentLimitState=&")
+        .search("div#kp-notebook-annotations")
+        .children
+        .select { |child| child.name == "div" }
+        .select { |child| child.children.search("div.kp-notebook-highlight").first.present? }
+        .map { |html_elements| Highlight.from_html_elements(book: self, html_elements: html_elements) }
+    end
+
+    def inspectable_vars
+      instance_variables
+        .select { |ivar| ivar != :@mechanize_agent }
+        .map { |ivar| "#{ivar}=#{instance_variable_get(ivar).inspect}" }
+        .join(", ")
+    end
+  end
+end
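
The memoization in `highlights_from_amazon` and the filtered `inspect` can be tried without Amazon credentials. The following is a minimal sketch, not the gem's code: `BookSketch` is a trimmed stand-in for `Book`, `CountingAgent` is a hypothetical fake with a made-up `fetch` method in place of Mechanize, and a plain `nil?` check replaces ActiveSupport's `present?` so the snippet has no dependencies.

```ruby
# Trimmed, dependency-free stand-in for KindleHighlights::Book (assumption:
# only the parts needed to show memoization and the filtered #inspect).
class BookSketch
  attr_reader :asin, :title

  def initialize(asin:, title:, agent: nil)
    @asin = asin
    @title = title
    @agent = agent
  end

  # Returns [] without an agent; otherwise fetches once and caches the result.
  def highlights
    return [] if @agent.nil?

    @highlights ||= @agent.fetch(asin)
  end

  # Lists every instance variable except the agent.
  def inspect
    vars = instance_variables
      .reject { |ivar| ivar == :@agent }
      .map { |ivar| "#{ivar}=#{instance_variable_get(ivar).inspect}" }
      .join(", ")
    "<#{self.class}: #{vars}>"
  end
end

# Hypothetical fake agent that counts how often it is asked for data.
class CountingAgent
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def fetch(_asin)
    @calls += 1
    ["highlight one"]
  end
end

agent = CountingAgent.new
book = BookSketch.new(asin: "B000XUAETY", title: "Platoon Leader", agent: agent)

book.highlights
book.highlights
puts agent.calls   # the fake agent is only hit on the first call
puts BookSketch.new(asin: "B000XUAETY", title: "Platoon Leader").inspect
```

Because `inspectable_vars` in `Book` filters out only `@mechanize_agent`, the cached `@highlights` will appear in `inspect` output once highlights have been fetched. The sketch behaves the same way.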