tl;dr: A list of useful resources for self-publishing a book on Amazon using Bookdown.
- Writing style
- Did I use any editor?
- How to create the book: Bookdown!
- Self-publishing on Amazon (Kindle and paperback)
- Costs and earnings
- Publishing outside Amazon: Gumroad
- Linking: B&W, color and Kindle on Amazon
- 11-tips to write good
Update Aug-28-2018: I published another blog post related to this one, which covers more technical aspects of Bookdown.
Update Nov-13-2018: The two "how-to" posts won first place in the 1st Bookdown Contest by RStudio 😃
A friend of mine told me to write down all the details about self-publishing a book.
So here you go—a long post explaining almost all the things I have done and discovered.
First of all, my thanks go to Bookdown. This R package allows enthusiastic people to self-publish a book! Although my book is based on the R language, this process can be applied to any kind of book.
Kaylen Sanders from OpenDataScience.com did a poetic review of it.
The two-year journey culminated with two paperback versions available on Amazon (Color and B&W), a Kindle version, an epub version, a PDF, and a website.
I didn't plan to write a book. Around six years ago, I started using R and, as with many programmers, my "personal" library with many shortcuts began to grow.
Then I thought that this library could help more people, so after arranging lots of things in the right place, I published the 'funModeling' package on CRAN in February 2016.
Google and the book R Packages by Hadley Wickham (April 2015) were incredibly useful. Don't hesitate to check out that book if you plan to write a package.
I deeply believe that when there is an explanation behind what we use and what we do, it changes the way we perceive the action. So, I started to document the funModeling package functions.
The documentation grew rapidly, and soon escaped from the original scope of the package to include general explanations of machine learning and data preparation, and then the first version of the book was born!
Two months after the release, I rewrote everything from scratch.
There are two key points here:
- We don’t always have a clear goal; we just "walk" and that goal takes shape.
- The first version is not always the ultimate one; start now with the ideas you already have and let them grow.
I wanted to "write everything I know"—things that took me a lot of time to learn—and expose the concepts with examples, lots of examples, so the reader can check them and extract their own conclusions.
The other remarkable point was "how to interpret all the results." I found that when someone explains the analytical thinking path, extracting different conclusions from the analysis, the understanding of the topic is boosted.
These two books are aligned with the last idea:
Data Mining: Concepts and Techniques 3rd Edition by Jiawei Han, Micheline Kamber and Jian Pei (2012)
Data Mining with R: Learning with Case Studies by Luís Torgo (2011)
Note: Data Science = Data Mining + some marketing ;) Nowadays: Data Mining = web scraping
Did I use any editor?
Nope, you can 100% self-publish a book on your own, with patience and the Amazon self-publishing service.
An editor or publisher can help with the book structure, proofreading, marketing, printing, and distribution, among other things. It saves time.
A friend of mine surprised me with a Facebook ad campaign for the book when I launched it. Apart from that, all marketing was done by word of mouth and some posts in the Data Science Heroes Blog.
Would you share it? ;)
How to create the book: Bookdown!
This amazing R package provides all the processes to create Kindle and paperback editions.
Get started with the minimum reproducible example at: https://bookdown.org/yihui/bookdown
The Data Science Live Book was 100% done using R and RStudio.
Bookdown alone should be a BIG point in any "Why R?" list.
You should google all of these terms before starting: LaTeX, YAML, knitr, R Markdown, Pandoc, GitBook. None of them are Pokémon.
Check the RStudio lessons on what R Markdown is: https://rmarkdown.rstudio.com/lesson-1.html
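To make those terms a bit more concrete, here is a minimal sketch of how small a Bookdown project can be. It is written in Python purely as a neutral scaffolding script: it writes a tiny `index.Rmd` (R Markdown with a YAML header) plus an optional `_bookdown.yml`, and builds, without running, the `Rscript` command that would render each edition. The book title, author, and file name are placeholders, not the settings used for the Data Science Live Book.

```python
import tempfile
from pathlib import Path

# Placeholder content: not the settings of the Data Science Live Book.
INDEX_RMD = """---
title: "My Book"
author: "Jane Doe"
site: bookdown::bookdown_site
documentclass: book
---

# Preface

Hello, Bookdown!
"""

# Optional project-level settings; book_filename names the output file.
BOOKDOWN_YML = 'book_filename: "my-book"\n'

# bookdown output formats: PDF (through LaTeX), EPUB, and the GitBook website.
FORMATS = {
    "pdf": "bookdown::pdf_book",
    "epub": "bookdown::epub_book",
    "website": "bookdown::gitbook",
}


def scaffold(folder):
    """Write a minimal index.Rmd and _bookdown.yml; return the file names."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    (folder / "index.Rmd").write_text(INDEX_RMD, encoding="utf-8")
    (folder / "_bookdown.yml").write_text(BOOKDOWN_YML, encoding="utf-8")
    return sorted(p.name for p in folder.iterdir())


def render_command(edition):
    """Build (but do not run) the Rscript call that renders one edition."""
    r_code = f'bookdown::render_book("index.Rmd", "{FORMATS[edition]}")'
    return ["Rscript", "-e", r_code]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(scaffold(tmp))
    print(" ".join(render_command("pdf")))
```

Running the printed command inside the project folder requires R, the bookdown package, and (for the PDF) a LaTeX installation.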
Self-publishing on Amazon
Amazon runs a program called Kindle Direct Publishing (KDP).
Publishing the paperback version
You upload the PDF and Amazon will print on demand. That's it. You don't have to invest any money in a stock of copies before the release. After you publish, if one person in Antarctica buys a copy, then it is printed and delivered.
There are other print-on-demand publishers, like lulu.com.
The quality is excellent in both the B&W and color versions, but the color one is stunning! I see how colors help us to understand. However, the printing costs are around four times higher for the color version.
Amazon will check several layout points before approving the release.
Check the color version:
And one from the black and white version:
Note the quality of the plots and code layout, which is pretty important in a programming book.
Publishing the Kindle version
Easier to publish than the paperback.
The Kindle version of the Data Science Live Book, here!
(Amazon is incredibly vast, from printing and publishing books to hosting deep learning processes on AWS. Someday, Amazon and Google will be countries.)
You won't become rich publishing books unless you have a catchy title, like "Fifty Shades of Data in Grey."
Costs and earnings
For the Kindle version, there are two royalty options: 35% or 70%. We always want the higher one, right? Well, to qualify for the 70% option, the book price must be US$9.99 at most.
Printing costs depend on several factors. On this page, you will find how costs/royalties are calculated as well as a "Printing cost calculator" Excel file: https://kdp.amazon.com/en_US/help/topic/G201834340
Amazon royalties are around 40% of retail price.
Typically, royalties when using a publisher are around 8–12% of the retail price. Source here.
Having a publisher/editor may make several things easier, so don't opt out only because of the earnings.
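To see how these percentages play out per copy, here is a small back-of-the-envelope sketch. The rates are the ones quoted above (35% or 70% with the US$9.99 ceiling, around 40% of retail price on Amazon, and 8–12% with a publisher); the prices are made-up examples. It deliberately ignores printing costs, delivery fees, taxes, and any other KDP conditions, so treat the results as rough.

```python
KINDLE_70_MAX_PRICE = 9.99  # ceiling for the 70% option, as quoted above


def kindle_royalty(price):
    """Per-copy royalty: 70% if the price is within the ceiling, else 35%."""
    rate = 0.70 if price <= KINDLE_70_MAX_PRICE else 0.35
    return round(price * rate, 2)


def earnings_per_copy(price, rate):
    """Per-copy earnings as a plain share of the retail price."""
    return round(price * rate, 2)


if __name__ == "__main__":
    # Hypothetical prices, chosen only to show the effect of the ceiling.
    for price in (9.99, 15.00, 19.98):
        print(f"At ${price:.2f}: ${kindle_royalty(price):.2f} per copy")

    retail = 25.00  # hypothetical retail price
    print("Self-published (~40%):", earnings_per_copy(retail, 0.40))
    print("With a publisher (8%):", earnings_per_copy(retail, 0.08))
    print("With a publisher (12%):", earnings_per_copy(retail, 0.12))
```

In this simplified view, pricing above US$9.99 only pays more per copy once the price passes roughly twice that amount, because the rate halves.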
Publishing outside Amazon: Gumroad
Gumroad is a service that allows users to sell different types of files across the internet, e.g., music, videos, and data science books.
Gumroad provides a shopping cart and, after payment, the buyer automatically receives an email with the download link. It works really well! No one complains about the service. The pricing is affordable: "If you use the Free version of Gumroad, our fee is just 8.5% + 30 cents per transaction. If you get the Premium version of Gumroad for $10 (USD)/month, our fee is 3.5% + 30 cents per sale."
I started with the free version and then changed to premium.
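When does switching to premium pay off? A minimal sketch, using only the fees quoted above (8.5% + 30 cents on the free plan, 3.5% + 30 cents plus US$10/month on premium) and assuming every sale is at a single fixed price, which is a simplification when buyers can pay more than the minimum:

```python
import math
from decimal import Decimal

# Fees as quoted in the post; they may have changed since.
FREE_RATE = Decimal("0.085")
PREMIUM_RATE = Decimal("0.035")
FIXED_FEE = Decimal("0.30")
PREMIUM_MONTHLY = Decimal("10")


def monthly_fees(sales, price, premium):
    """Total fees for one month of `sales` copies sold at `price` dollars."""
    price = Decimal(str(price))
    rate = PREMIUM_RATE if premium else FREE_RATE
    total = sales * (price * rate + FIXED_FEE)
    return total + PREMIUM_MONTHLY if premium else total


def break_even_sales(price):
    """Monthly sales at which premium stops costing more than the free plan."""
    saving_per_sale = Decimal(str(price)) * (FREE_RATE - PREMIUM_RATE)
    return math.ceil(PREMIUM_MONTHLY / saving_per_sale)


if __name__ == "__main__":
    price = 5  # the minimum price of the book on Gumroad
    n = break_even_sales(price)
    print(n, monthly_fees(n, price, False), monthly_fees(n, price, True))
```

Premium saves 5% of the price on each sale, so the US$10 monthly fee is covered once monthly revenue reaches about US$200, whatever the price per copy.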
One of the most useful features is that they allow embedding the payment form into your website. You can check mine here.
The other useful feature is name your price. The minimum price to download the Data Science Live Book is US$5 and the buyer gets all three versions: PDF, .mobi, and .epub.
While I was writing this post, I saw that 37% of buyers spent more than the minimum—I'm happy you like the project!
This is the list of countries of the unique buyers who bought through Gumroad, so it works worldwide.
Did you find the outlier? :P
Always try to share the book before its release.
Proofreading is needed at two levels: technical and grammatical.
Regarding the technical aspects, the proofreading was mainly done by Pablo Seibelt, Head of Data at Auth0. I have also made some changes based on people’s feedback. (Thanks!)
Regarding the grammar check, I hired several English teachers and finally stayed with one outstanding freelancer, Dr. Candy Pettus, from www.fiverr.com (a site for hiring freelancers).
The tree on the cover was generated by a short, iterative algorithm, representing the idea that simplicity is the seed of complexity.
Like the Lorenz Attractor.
And like nature itself...
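The algorithm behind the cover tree is not shown in this post, so the following is only an illustrative sketch of the general idea: a few lines, repeated level after level, are enough to produce a branching shape. The angle, shrink factor, and number of levels are arbitrary assumptions.

```python
import math


def grow_tree(levels, length=1.0, spread=math.radians(25), shrink=0.7):
    """Return the line segments ((x0, y0), (x1, y1)) of a simple binary tree."""
    segments = []
    # Each tip is (x, y, heading, branch length); start with a vertical trunk.
    tips = [(0.0, 0.0, math.pi / 2, length)]
    for _ in range(levels):
        next_tips = []
        for x, y, heading, size in tips:
            x1 = x + size * math.cos(heading)
            y1 = y + size * math.sin(heading)
            segments.append(((x, y), (x1, y1)))
            # Every branch ends in two shorter branches, one to each side.
            next_tips.append((x1, y1, heading + spread, size * shrink))
            next_tips.append((x1, y1, heading - spread, size * shrink))
        tips = next_tips
    return segments


if __name__ == "__main__":
    tree = grow_tree(levels=8)
    print(len(tree), "segments")
```

Each level doubles the number of branches, so the same tiny rule yields hundreds of segments after only a few iterations; the segments can be passed to any plotting tool to draw them.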
Then, the designer, Barbara Muños, took the little tree and made the magic!
And, yes, the curve on the top follows the Fibonacci ratio, a feature present in flowers, art, the human body, and "a big etc."
There are plenty of cover designers on fiverr.com.
The International Standard Book Number (ISBN) is a unique numeric commercial book identifier. Publishers purchase an ISBN from an affiliate of the International ISBN Agency.
You will need an ISBN for each book version. In my case, I bought three: one for each of the Kindle, B&W, and color versions. Note: the content is the same in all three.
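An ISBN is not just an arbitrary number: its last character is a check digit computed from the others, which lets stores catch typing errors. A minimal sketch of the standard ISBN-10 check and the ISBN-10 to ISBN-13 conversion follows. It assumes that the ten-character identifiers in the Amazon links at the end of this post are the ISBN-10s of the two paperbacks.

```python
def is_valid_isbn10(code):
    """ISBN-10 check: the weighted sum (weights 10 down to 1) is a multiple of 11."""
    code = code.replace("-", "").upper()
    if len(code) != 10:
        return False
    total = 0
    for position, char in enumerate(code):
        if char == "X" and position == 9:
            value = 10  # "X" stands for 10, allowed only as the check digit
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


def isbn10_to_isbn13(code):
    """Prefix 978, drop the old check digit, and compute the ISBN-13 one."""
    core = "978" + code.replace("-", "")[:9]
    weighted = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    check = (10 - weighted % 10) % 10
    return core + str(check)


if __name__ == "__main__":
    # Identifiers taken from the Amazon links in this post (assumed ISBN-10s).
    for isbn in ("9874269049", "9874273666"):
        print(isbn, is_valid_isbn10(isbn), isbn10_to_isbn13(isbn))
```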
Amazon can provide you with a "free" ISBN if you sell the book only with them. If you want to sell in other markets, then you will have to get one outside of Amazon.
"An imprint of a publisher is a trade name under which it publishes a work" Source: Wikipedia.
The imprint is defined when you register your ISBN. Being a self-publisher means that you can pick your imprint name (it can be your real name or a fictitious one).
If you self-publish and you bought an ISBN outside the US, then you have to let bowker.com know that you effectively own the ISBN you are uploading. Contact them by email for more information.
Disclaimer: I'm not an expert on ISBNs or imprint names; therefore, what applied in my case may not apply in yours.
Linking: B&W, color and Kindle on Amazon
If you are selling these three versions, you might expect to have all of them linked. You will need to contact Amazon support to do this, because they only support one paperback linked to a Kindle edition. Yet, the final result is somewhat annoying...
Where "Paperback" links to the B&W version and "Paperback, March 27, 2018" links to the color version (🧟‍♂️).
Ask them to add the label "Color" or "B&W" at the end of each title.
How to write good
I prepared a list of issues and Bookdown configurations to keep in mind, which took me a lot of time to find. Hopefully, it will save you time: https://blog.datascienceheroes.com/how-to-self-publish-a-book-customizing-bookdown/
Is it possible to self-publish a book? Yes, definitely!
- GitHub: https://github.com/pablo14/data-science-live-book
- Web version: http://livebook.datascienceheroes.com
- Amazon B&W: https://www.amazon.com/dp/9874269049
- Amazon Color: https://www.amazon.com/dp/9874273666
I found this post useful on this topic: Writing an R book and self-publishing it in Amazon by Marcelo Perlin.
I will leave you with the back of the Data Science Live Book:
Thanks for reading! :)