Category Archive: Geekery

Live Word Count in VIM!

A live word count (one updated as you type) is a feature I love when writing non-technical documents and non-programs.

For instance, newer versions of WordPress feature this at the bottom of the new-post text box (though it waits for a pause in your typing, albeit a very short one, before it updates the count). It’s also a great feature of surprisingly few author-orientated writing programs (Scrivener is one).

However, for items that are neither WordPress blog posts nor fiction, I prefer to use vim, because you can’t take the programmer out of the writer sometimes. Heck, even for blog posts or fiction I often prefer vim (especially in the area of syntax highlighting). But sadly, for a long time, vim did not come with a live word count.

Until now.

Well, sort of. There’s a version that’s been cooked up by one of the commenters on Stack Overflow.

I use this just in macvim, not terminal vim, because macvim is for more extended projects/programming, and terminal vim is for quick and dirty scripts. Thus I only really care about wordcount in the Mac equivalent of gvim.

But no worries; if you use .gvimrc in addition to .vimrc, vim reads .gvimrc afterwards when the GUI starts, so its settings override those in .vimrc. Excellent, of course.

So this went into my .gvimrc (and really, it should go into a separate plugin file or something in my .vim directory, but I’m too tired to care at the moment):

"-------------- word count ---------------
" from http://stackoverflow.com/questions/114431/fast-word-count-function-in-vim/120386#120386

"returns the count of how many words are in the entire file excluding the current line
"updates the buffer variable Global_Word_Count to reflect this
fu! OtherLineWordCount()
    let data = []
    "get lines above and below current line unless current line is first or last
    if line(".") > 1
        let data = getline(1, line(".")-1)
    endif
    if line(".") < line("$")
        let data = data + getline(line(".")+1, "$")
    endif
    let count_words = 0
    let pattern = "\\<\\(\\w\\|-\\|'\\)\\+\\>"
    for str in data
        let count_words = count_words + NumPatternsInString(str, pattern)
    endfor
    let b:Global_Word_Count = count_words
    return count_words
endf

"returns the word count for the current line
"updates the buffer variable Current_Line_Number
"updates the buffer variable Current_Line_Word_Count
fu! CurrentLineWordCount()
    if b:Current_Line_Number != line(".") "if the line number has changed then add old count
        let b:Global_Word_Count = b:Global_Word_Count + b:Current_Line_Word_Count
    endif
    "calculate number of words on current line
    let line = getline(".")
    let pattern = "\\<\\(\\w\\|-\\|'\\)\\+\\>"
    let count_words = NumPatternsInString(line, pattern)
    let b:Current_Line_Word_Count = count_words "update buffer variable with current line count
    if b:Current_Line_Number != line(".") "if the line number has changed then subtract current line count
        let b:Global_Word_Count = b:Global_Word_Count - b:Current_Line_Word_Count
    endif
    let b:Current_Line_Number = line(".") "update buffer variable with current line number
    return count_words
endf

"returns the word count for the entire file using variables defined in other procedures
"this is the function that is called repeatedly and controls the other word
"count functions.
fu! WordCount()
    if exists("b:Global_Word_Count") == 0
        let b:Global_Word_Count = 0
        let b:Current_Line_Word_Count = 0
        let b:Current_Line_Number = line(".")
        call OtherLineWordCount()
    endif
    call CurrentLineWordCount()
    return b:Global_Word_Count + b:Current_Line_Word_Count
endf

"returns the number of patterns found in a string
fu! NumPatternsInString(str, pat)
    let i = 0
    let num = -1
    while i != -1
        let num = num + 1
        let i = matchend(a:str, a:pat, i)
    endwhile
    return num
endf

"example of using the function for statusline:
"set statusline=wc:%{WordCount()}

"-------------------------------------------

set statusline=%<\%f\ %y%m%r\ wc:%{WordCount()}%=%l,%c%V\ \ %L\ lines:%P\  

The last piece of configuration is my normal statusline with word count inserted, so it shows up like this:

[Image: gvimrc status line with word count]

You can get more help with the statusline codes in vim via :help statusline, or visit the vim online documentation project, at ’statusline’. (You can search the rest of the online documentation as well.)

This has been my geekout for the day.

RubyEpub Tools (ruby-epub) 0.0.2 Released

Added the ‘add-to-ncx’ operation to the epub script, and removed the creation of the template HTML file, which just got in the way.

See the ruby-epub Google Code page.

RubyEpub Tools (ruby-epub) 0.0.1 Released

Right now this is a very minimal bundle of functionality. Basically it’s my create/add-buncha-files/compile script, and not much else. It’ll work on Mac OS X and any Unix. Windows is, on the other hand, special. I don’t have a Windows box, so I don’t know.

On the ruby-epub Google Code page is a featured download (ruby-epub-0.0.1.gem) and a featured wiki page (“Installing”), and also the road map (more a check list) prominently displayed.

Perfecting (Simple) PDF Conversion to EPub and Mobipocket

Problem: Convert PDF to reflowable text, preferably HTML.

Why: Text that reflows based on the size of the screen, the size of the font, or the length of a page (or indeed, without the concept of a page at all) is what best suits mobile ebook readers, with their smaller screens.

Apologies for two Geekery posts in a row. The rest of the discussion is under the fold.

Click here to read more »

LRF to HTML: The Rough Guide

As of this writing, calibre, which features command-line tools for converting many formats into one another, does not convert LRF to HTML, or indeed to much of anything other than LRS, an XML format. Currently this is not a high-priority item to fix in calibre itself, because calibre is aimed at converting things to LRF. (The ePub conversion is still relatively new and shiny.)

ETA: Here’s the LRS specification.

So. Heck. Why not. I’m using Ruby, by the way, because Ruby has the kick-ass REXML library, which also forms the cornerstone for my ruby-epub stuff (still in the making).

Geekery after the cut.

Click here to read more »

State of the Union: A Very Basic Ruby Epub Library

Currently I enjoy, through hacks and such, easy ways to create an ePub project directory, update the OPF and NCX files within, compile it all, and even run epubcheck.

I’m starting to refactor and redesign all that, with an eye to providing a Ruby library that allows manipulation of various parts of Epub, as well as a Project class tying the elements together (after all, when a unique identifier needs to be synced between two files with different formats of almost entirely different inheritances, it gets a bit annoying to do manually).
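
A minimal sketch of the kind of identifier check such a Project class might perform, using REXML. This is not code from ruby-epub; the method names and the sample documents are hypothetical. It relies on the ePub 2 convention that the OPF package's unique-identifier attribute names a dc:identifier element, whose value has to match the dtb:uid meta entry in the NCX.

```ruby
require 'rexml/document'

# Hypothetical sample files, trimmed to the parts that carry the identifier.
OPF_SAMPLE = <<~XML
  <package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="BookId">
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Example</dc:title>
      <dc:identifier id="BookId">urn:uuid:12345</dc:identifier>
    </metadata>
  </package>
XML

NCX_SAMPLE = <<~XML
  <ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
    <head>
      <meta name="dtb:uid" content="urn:uuid:12345"/>
    </head>
  </ncx>
XML

# Text of the dc:identifier that the package's unique-identifier
# attribute points at, or nil if no such element exists.
def opf_identifier(opf_xml)
  doc = REXML::Document.new(opf_xml)
  wanted_id = doc.root.attributes['unique-identifier']
  doc.root.each_recursive do |el|
    # el.name is the local name, so "dc:identifier" shows up as "identifier".
    return el.text.to_s.strip if el.name == 'identifier' && el.attributes['id'] == wanted_id
  end
  nil
end

# Content of the NCX <meta name="dtb:uid"> entry, or nil if it is missing.
def ncx_identifier(ncx_xml)
  doc = REXML::Document.new(ncx_xml)
  doc.root.each_recursive do |el|
    return el.attributes['content'] if el.name == 'meta' && el.attributes['name'] == 'dtb:uid'
  end
  nil
end

# True only when both files carry the same, non-missing identifier.
def identifiers_in_sync?(opf_xml, ncx_xml)
  id = opf_identifier(opf_xml)
  !id.nil? && id == ncx_identifier(ncx_xml)
end
```

Calling identifiers_in_sync?(OPF_SAMPLE, NCX_SAMPLE) on the two samples reports whether they agree; a real project would read the two files from the project directory instead.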

One day I’ll probably stick this all in a RubyCocoa interface, so that we have an opposite number to the Windows-only Mobipocket tools.

This library will one day, given the blessings of the RubyForge administrators, become a gem for people to play with. Right now it resides over at https://ruby-epub.googlecode.com/, where you can see a roadmap and browse the code and check-ins.

The library is in a very primitive state at the moment, and some things still need to be tested more thoroughly (and some things are already tested fairly intensively, but you can almost always use more), so it’s not yet released. Right now there are two scripts, and, as part of the last check-in proclaims:

  “At this point we can create a basic epub and then compile it immediately and the result passes epubcheck.”

It’s all GPLv3 licensed, by the way. I like things to be open, because then people can patch stuff, or use the library and do other things of their interest, or suggest more in-depth design changes I’m missing out on because I’m not an in-depth Ruby-ist, and so on. Not that I’ll listen to all of it (another reason I like things to be open: people can branch). I’m not sure how I’ll deal with things once it all goes Cocoa, but I look forward to the future with optimism.

(And also to the new Subversion and its changelists. Changelists are godsends.)

(Now is not the time to argue with me about using wxWidgets or the benefits of being truly cross-platform. No, it really isn’t, unless you want to end up quarantined for a while. I care about as much as anybody writing Mac programs cares, which is about the amount you can fill in a thimble.)

(Nor is this the time to tell me to use Python, as most of the other Epub tools out there do, save for the few in Java. I’ve used Python. I’m probably one of the few people who likes the syntax, in fact, but at the moment Ruby needs a library and I work in Ruby. Pardon me for being selfish, but I am indeed both open and selfish at the same time.)

During all this I ended up learning Rake and part of Gem creation, and also dove a bit more into Ruby, and thus wrote up a couple quick references (now featured on the new Quick! page).

By the way, things about my coding style and approach:

  • Copious comments, unless I’ve obviously rushed things (in which case I feel horrible). One of the first stops I made during my Ruby crash course was to find out how Rdoc worked.

  • Object-orientated, modular design. I really like responsibilities to belong to cohesive units that can be called by other cohesive units. Crazy, I know.

  • I tend not to do wild and crazy kool-kid hacker things, because I like my code to be readable. And yes, some people do think I’m stupid because I don’t use unless, or dance around with the ternary operator and bit-mode flags, or re-implement my own XML parser, but whatever.

  • I don’t worry about speed and rock-hard reliability against all edge cases up front. I add that in later, because the devil’s advocate unit tests keep failing until I do.

Mike and Psmith, Psmith in the City Epub Versions

In celebration of moving my downloads over to WP DownloadManager, I decided to release Epub versions of Mike and Psmith and Psmith in the City. ETA: And also Psmith, Journalist. For more about Psmith, see my Kindle-licious series.


  Mike and Psmith: Epub (157.6 KiB, 151 hits)
  Psmith in the City: Epub (157.9 KiB, 180 hits)
  Psmith, Journalist: Epub (167.1 KiB, 173 hits)

I’ve written this warning a few times before, but I might as well do it again:

Warning Warning Warning

The above texts are public domain only in the United States, anywhere with a Berne-convention-style copyright that expires 25 years after author death, and anywhere else without copyright laws.

If you live anywhere else, especially in Canada, Mexico, the United Kingdom, Ireland, every country in continental Europe, or almost every country in Asia, South America, and Africa, these are not legal for you to download, read, read aloud, print, or store on a computer or server unless it happens to be housed in the United States, etc etc etc.[1]

For more information, see Copyright and Wodehouse.

End Warning

I wrote a few scripts to make the Epub process a snap for those of us working by hand. I’m not ready to release them, but here’s an example session (warning: extreme geek):

Click here to read more »

  1. If you think this is ridiculous, join the club. I’m not against copyright in general (far from it), but the man’s been dead for over 25 years.

Updated: New and Faster Journey to the West ePub

I noticed that Journey to the West got slower and slower to read in Adobe Digital Editions. I found out why, and fixed it. Here are the download links again:

Download:

  Journey to the West: Epub (1.5 MiB, 420 hits)
  Journey to the West: Kindle/Mobipocket (2.1 MiB, 435 hits)

License: Creative Commons Attribution-NonCommercial 2.5
Attribution: Based on work copyrighted 2005 by Silk Pagoda (also CC Attribution Non-Commercial 2.5 Licensed).

The reason Adobe Digital Editions was slow on the ePub version comes down to the structure of ePub versus the structure of Mobipocket.

Say that all of my book’s content is in one HTML file. When I stick that into ePub, it’s still just one HTML file. Adobe Digital Editions—and really, just about any ePub reader—unzips and puts the entire file into memory.

That’s fine if the file is, say, the size of a 50 page story. But for a story that’s over 1200 pages long, that method is going to run into problems.

Solution: break up the huge HTML file into 100 much smaller files—one for each chapter.

Now as you go through the book, Adobe Digital Editions will only have perhaps a few small chapters in memory (the ones immediately before and after your position in the overall book).[1] This speeds everything up considerably, and you can read as quickly at Chapter 100 as you can at Chapter 1.
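
The split described above can be sketched in Ruby. This is only a minimal sketch under stated assumptions, not the script actually used for the book: it assumes every chapter opens with an <h2> heading, and the names split_chapters, wrap, and chapter_files, along with the chapterNNN.html file naming, are hypothetical.

```ruby
# Minimal sketch: split one long HTML body into per-chapter XHTML pages.
# Assumption (not from the original post): each chapter starts with <h2>.

# Cut the body in front of every <h2>, keeping each heading with its chapter.
def split_chapters(html_body)
  html_body.split(/(?=<h2\b)/i).map(&:strip).reject(&:empty?)
end

# Wrap one chapter's markup in a bare-bones XHTML page.
def wrap(chunk, title)
  <<~HTML
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head><title>#{title}</title></head>
    <body>
    #{chunk}
    </body>
    </html>
  HTML
end

# Map hypothetical file names (chapter001.html, ...) to page contents.
def chapter_files(html_body, title: 'Book')
  pairs = split_chapters(html_body).each_with_index.map do |chunk, index|
    [format('chapter%03d.html', index + 1), wrap(chunk, title)]
  end
  pairs.to_h
end
```

To write the pages out, something like chapter_files(body).each { |name, html| File.write(name, html) } would do. Anything before the first heading (a title page, say) ends up as its own first file, and each new file would still need to be listed in the OPF manifest and spine.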

Why didn’t this problem affect the Mobipocket version? Because the mobigen compiler automatically does what I had to do for the ePub: it breaks the longer book up into 100 separate files,[2] so the reader never has more than a few chapters loaded at a time.

And this is where I say Computer Science! and drop off for a good night’s rest.

  1. Or Adobe Digital Editions is dumber than I think it is, and only has the current chapter in memory. Still, it’s better than having all 100 loaded.
  2. 100 separate records in its internals, which is actually a Palm database.

Creating eBooks: An ePub Tutorial

The most recent version of this document is now at the Spontaneous Derivation Wiki.

This is a step-by-step tutorial, with an example, on making a standards-compliant ePub book by hand.

We’ll be using the public domain (in both illustrations and text) book The Velveteen Rabbit. It has the following good qualifications as a tutorial example:

  • Small.
  • Exists in HTML form in the public domain.
  • Tiny table of contents, but a table of contents still exists.
  • Images.

Click here to read more »

Fooling Around with Amazon Images

I’m a fussy person. I wanted to show books in my library on my blog sidebar—but not just in any old way.

Update: Fixed the code samples and script.

Requirements

  • One or more things in the sidebar that show books in my library.

  • I want to show books I’ve read, books I’m currently reading (and, in some cases, re-reading), and books I will read.

  • Preferably separated from each other.

  • I want to be able to adjust the number of books shown in each category.

  • I want someone else to worry about all the little book images and not have to store/resize them myself.

  • I want flexibility in linking; sometimes I want to link to reviews I’ve written, for instance, and sometimes direct to Amazon, Audible, Webscriptions, etc.

  • If I’m linking to Amazon, I want my Amazon Associates code attached. Optionally, if any other stores have associates programs, I want to use those tags too.

Widgets from Book Social Networks

None of the widgets from Shelfari, GoodReads, or LibraryThing could satisfy these requirements.

The widgets at GoodReads came closest, but in the end they weren’t flexible enough.

Now we descend into high geekery, including Ruby code, so the rest of this goes under the cut.

Click here to read more »
