A technical overview of Paperight site architecture

This is an unusually technical post for us: a description of what the Paperight website is made of. We’re very proud of the fact that Paperight is built from scratch using open-source technology, in part so that we can publish the code openly one day (when we have the capacity to manage a public open-source project). I asked Shaine Gordon, lead developer at Realm Digital, to describe how the site is built.

From the outset, Paperight.com was designed with the goals of efficiency, scalability, speed and security.

This led to the decision to use Java/JSP (GPL) as the starting point.

Apache Tomcat 7 (Apache Licence 2.0) was selected as the Java container, being the industry standard when it comes to ease of setup, and performance.

For the architecture, SpringSource’s Spring Framework (Apache Licence 2.0) was selected. Again, this was chosen for its industry-standard performance, efficiency, and large support community.

The front end runs on Spring MVC 3.1, using Apache Tiles for layout management. This is secured using Spring Security 3.1.

For domain object management and persistence, we chose to use JPA (Java Persistence API), backed by Hibernate, using JTA (Java Transaction API) transaction management to ensure data can be retrieved and persisted reliably.

For indexing and searching content, Apache’s Lucene was the framework of choice. This was then seamlessly integrated with Hibernate, using Hibernate Search (LGPL 2.1).

One of the core requirements was the ability to alter and watermark PDF documents on the fly. For this we chose Apache’s PDFBox (Apache Licence 2.0). Its ability to process documents quickly and its free, open licence made it the natural choice.

The backend management system requires large amounts of data to be processed, for example for product imports. This should be relatively transparent to admins, and also provide sufficient feedback on failed jobs and errors. To this end, we again chose SpringSource projects: Spring Batch and Spring Integration. Spring Integration is a Java “Spring-way” implementation of the famous “Enterprise Integration Patterns”.
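As a rough illustration of the kind of feedback described above, here is a minimal plain-Java sketch of an import run that carries on past a bad record and reports each failure. It does not use Spring Batch or Spring Integration. The `ImportReport` class, the `title,price` record format and the validation rules are all assumptions made for the example, not Paperight’s actual import code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: not Paperight's code and not the Spring Batch API.
class ImportReport {
    final List<String> imported = new ArrayList<>();
    final List<String> failures = new ArrayList<>();

    // Processes every line; a bad record is recorded as a failure instead of aborting the run.
    static ImportReport run(List<String> lines) {
        ImportReport report = new ImportReport();
        for (int i = 0; i < lines.size(); i++) {
            try {
                report.imported.add(parseTitle(lines.get(i)));
            } catch (IllegalArgumentException e) {
                // Line numbers are 1-based so an admin can find the record in the source file.
                report.failures.add("line " + (i + 1) + ": " + e.getMessage());
            }
        }
        return report;
    }

    // Assumed record format: "title,price" with a non-negative decimal price.
    private static String parseTitle(String line) {
        String[] parts = line.split(",", -1);
        if (parts.length != 2 || parts[0].trim().isEmpty()) {
            throw new IllegalArgumentException("expected 'title,price'");
        }
        double price;
        try {
            price = Double.parseDouble(parts[1].trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("price is not a number");
        }
        if (price < 0) {
            throw new IllegalArgumentException("negative price");
        }
        return parts[0].trim();
    }
}
```

Calling `ImportReport.run(lines)` returns both the imported titles and a list of human-readable failure messages, which is roughly what an admin screen would need to show after a product import.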

The result is that Paperight.com runs on a stack of 100% free and open-source software, without in any way compromising reliability or performance. It could be argued that our choices actually increased reliability and performance, relative to proprietary alternatives.

Outlet advertising with the Paperight poster

Our outlets need a simple way to tell their customers that they can print Paperight books, and what books are available. So we’ve made a poster for the shop window. (Click the thumbnail to enlarge, or download the 1.2MB PDF.)

Right now, we have about a thousand books on the site. So we chose about 50 that we think show off some of our best stuff so far. We’ve included a lot of high-school and undergrad setworks, some classic self-help, and the thing that gets us most of our Internet traffic: past papers for South Africa’s grade 12 exams. There’s also sheet music, resources for teachers and nurses, classic children’s books, sci-fi and fantasy novels, and seminal philosophy and psychology.

Any outlet that registers or asks for one will get a copy of the poster. It’s A2, and printed on lovely thick recycled paper.

On the back is a big, catchy cover and headline. Put it in a shop window and the catchy side attracts customers while the inside shows our fav-fifty books. Each featured book has a 3-digit shortcode for finding it quickly on paperight.com – just enter the shortcode in the search bar.
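For the technically curious, a shortcode search could work along these lines. This is only a sketch: the `ShortcodeLookup` class and its `find` method are hypothetical names, and the assumption that a three-digit query is tried as a shortcode before falling back to an ordinary search is ours, not a description of how paperight.com is implemented.

```java
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical sketch: the class and method names are illustrative only.
class ShortcodeLookup {
    private static final Pattern THREE_DIGITS = Pattern.compile("\\d{3}");
    private final Map<String, String> titlesByCode;

    ShortcodeLookup(Map<String, String> titlesByCode) {
        this.titlesByCode = titlesByCode;
    }

    // Returns the featured title if the query is a known 3-digit shortcode.
    // An empty result means the caller should fall back to a normal search.
    Optional<String> find(String query) {
        String trimmed = query.trim();
        if (!THREE_DIGITS.matcher(trimmed).matches()) {
            return Optional.empty();
        }
        return Optional.ofNullable(titlesByCode.get(trimmed));
    }
}
```

Keeping the codes as strings rather than numbers preserves any leading zeros, so a code such as 042 would not be confused with 42.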

Also, any outlet that registers in Feb (and most of March we reckon, while budget lasts) will get $50 credited to their account, which will cover the rights fees for about 30 books. Just register and choose ‘Account’ as your payment method when ordering.

Paperight internships

If you’re a high-achieving individual looking for work experience, think about applying to be an intern at Paperight. We have a lot of work to do, a big vision, and we like teaching people useful stuff. Working with interns, we get more done, you learn about publishing, distribution, technology and a little finance, and the world gets a jillion print-on-demand bookstores.

Depending on your interests and skills, you might work with our content team finding and prepping books for distribution, our outlet team working with printing outlets, or our technology team building state-of-the-art web software. We won’t lie: it’s not glamorous work. But the people are fabulous, the coffee is superb, and we pay you a small but respectable fee. Internships are usually at least two months long.

To apply, email team at paperight dot com. Note: Your cover letter is much, much more important than your CV. You don’t even have to send a CV if you can link to a thorough LinkedIn profile (or something similar). If you have a blog or Twitter account, let us know – an outward-looking life, online or otherwise, scores extra points. We’ll then have a phone conversation with people we think may be a good fit.

We’re hiring a Customer Relations Manager

UPDATE, 13 Feb 2012: We’ve renamed this position ‘Customer Relations Manager’ to make it clearer. Our initial title, ‘Outlet Champion’, could be confusing if you don’t know Paperight well yet. Same job, different title. This post has been updated to reflect the change.

UPDATE, 21 March 2012: This position has been filled. We’ll be announcing our new team members on the blog soon.

We need a Customer Relations Manager to sign up new outlets, and to work with outlets to make Paperight truly useful to them. Paperight’s success depends on our outlets: if outlet staff are happy, we’re happy.

Any business or organisation can register as a Paperight outlet. So outlets will include photocopy businesses, NGOs, schools, colleges, libraries and more. The Customer Relations Manager will

  • travel a lot to meet outlet managers in person
  • be the first to answer the phone or email when outlets get in touch
  • collect and analyse outlet feedback and website usage, to teach us how to serve them better
  • lead the planning and execution of marketing efforts.

And a bunch of other things. We’re a small team with specific goals, and we all contribute to achieving them in ways that can change from day to day.

We’re keen to hear from people who have worked with service-oriented retailers before, ideally in a sales or marketing role. But more importantly, we want to hear from people who believe in our mission: to put every book within walking distance of every home within five years.

This is a contract position for six months that pays R12000/month, with possibility of renewal for a further period. We’ll cover travel expenses too. The office has great coffee and even nicer people.

Apply by emailing:

team [at] paperight [dot] com

Note: Your cover letter is much, much more important than your CV. We want to know about you as a person in the cover letter, a motivation for why you want to work with us. If you have a blog or Twitter account, let us know – an outward-looking life, online or otherwise, scores extra points. We’ll then have a phone conversation with people we think may be a good fit.

There’s no deadline: we’ll update this post when the position’s filled.

Incentivising honest printing

A soon-to-be outlet manager, Terence, mailed me today with a common question:

I think [Paperight] is an excellent idea, just wondering how one would control the dishonest [outlets] who would buy one pdf and print many. I am sure you have seen the case of the copyshop in Durban central where they had hundreds of thousands of rands worth of textbooks they had copied. I think it was the fraud squad who arrested them.

Terence is right that dishonest photocopying of books is a common problem. (Sometimes it’s not deliberate dishonesty, but just copyright ignorance.) However, I think publishers have been approaching the problem in a one-dimensional way. The fact is, copy shops are meeting a customer demand that publishers and booksellers aren’t meeting. To me, copy shops should be publishers’ distribution partners, which is what Paperight will enable. So, the question then is, “How do we make it worthwhile for copy shops to play by the rules, rather than break them?” Technical restrictions and dire threats alone won’t do the trick – we can only do this by offering incentives.

When we provide a PDF for an outlet to print for a customer, we watermark each page with the outlet’s name, the customer’s name, the date, and a unique URL. Visiting the URL (on a computer or mobile phone) takes the customer to an online help-and-discussion forum for their book. For instance, students can talk to other students about the book or subject they’re studying. Entrepreneurs reading ‘How to start a business’ can talk to other entrepreneurs facing similar challenges. We can also use this online space for prizes and special offers. For some books, only the original purchaser of the document will be entitled to these services. That way, customers are incentivised to request their very own print-outs, discouraging copy shops from making multiple dishonest copies.
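To make the watermark idea concrete, here is a minimal Java sketch of the data that would go onto each page. It only builds the text; stamping it onto the PDF is a separate step. The `Watermark` class, the wording of the line and the `https://example.com/d/` address are placeholders we have assumed for illustration, as is the use of a random UUID as the unique part of the URL.

```java
import java.time.LocalDate;
import java.util.UUID;

// Hypothetical sketch: field names, URL layout and text format are assumptions.
class Watermark {
    final String outlet;
    final String customer;
    final LocalDate date;
    final String url;

    Watermark(String outlet, String customer, LocalDate date, String token) {
        this.outlet = outlet;
        this.customer = customer;
        this.date = date;
        // Placeholder address; the real URL scheme is not described in this post.
        this.url = "https://example.com/d/" + token;
    }

    // Uses a fresh random token so that each licensed print-out gets its own URL.
    static Watermark create(String outlet, String customer, LocalDate date) {
        return new Watermark(outlet, customer, date, UUID.randomUUID().toString());
    }

    // The single line of text that would be stamped on every page.
    String pageText() {
        return "Printed by " + outlet + " for " + customer + " on " + date + " | " + url;
    }
}
```

Because the token is generated per request, two customers buying the same book from the same outlet on the same day would still get different URLs.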

By offering this feature, we provide value to the customer. And we also get valuable feedback. Because customers visit the forums and offers through their documents’ URLs, we can track and analyse where our documents end up. That gives the publisher feedback, and shows up potential trouble-spots where piracy might be happening. (E.g. if ten people use the same URL in an area, there are likely to be illegal copies there, and we can trace that back to the original copy shop.) This is simply not possible in conventional book distribution.
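The trouble-spot check in that example can be sketched in a few lines of Java. This is a simplified illustration under our own assumptions: the `Visit` fields, the way an area is identified, and the idea of counting distinct visitors per URL and area against a threshold are hypothetical, not a description of a system we have built.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: the Visit fields and the threshold are assumptions.
class SharedUrlDetector {
    static class Visit {
        final String url;
        final String area;
        final String visitorId;

        Visit(String url, String area, String visitorId) {
            this.url = url;
            this.area = area;
            this.visitorId = visitorId;
        }
    }

    // Returns "url @ area" keys that were visited by at least `threshold` distinct visitors.
    static Set<String> suspicious(List<Visit> visits, int threshold) {
        Map<String, Set<String>> visitorsByKey = new HashMap<>();
        for (Visit v : visits) {
            visitorsByKey
                .computeIfAbsent(v.url + " @ " + v.area, k -> new HashSet<>())
                .add(v.visitorId);
        }
        Set<String> flagged = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : visitorsByKey.entrySet()) {
            if (e.getValue().size() >= threshold) {
                flagged.add(e.getKey());
            }
        }
        return flagged;
    }
}
```

Counting distinct visitors rather than raw visits matters here: one student returning to their forum ten times is normal, while ten different people arriving from one watermark URL is the pattern worth a closer look.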

Will it eliminate dishonest copying entirely? No, there will always be a measure of that. The important thing is to offer a sensible, attractive alternative that’s as convenient as and more useful than piracy. iTunes did that for music, and Paperight can do the same for books.

We’ll also send a catalogue of the best content we have each month to all registered copy shops that we believe are playing by the rules. This will be in the form of a poster for their shop window to help draw foot traffic and, therefore, more printing customers. On the one side, a large headline grabs the attention of passersby, and on the other, we show the top fifty books on Paperight that month. Our first catalogue will go out in February. If you’d like to get one, register on Paperight for free.

An update to our rightsholder info and agreement

We’ve just made our first changes to the standard rightsholder agreement on paperight.com, based on feedback from publishers and legal advisors. We’ve added some wording to clarify important points.

We answer the question: Who’s granting what licences exactly? We’ve made it clear that rightsholders authorise Paperight to enable a very limited licence between the rightsholder and the outlet. Then we’ve added more detail on what an outlet’s customer is likely to pay in total for a book, compared to buying a publisher’s edition in a bookstore. And in the “Term and termination” section of the rightsholder agreement, we’ve clarified that during a termination notice period, documents may still be available for outlets to print.

Here are the exact additions:

  • In the Q&A info section, we’ve added: “What about my copyright? You retain full copyright in your content. You’re only allowing Paperight outlets to acquire a very specific licence. Each time a registered outlet requests rights for a book or document that you are offering through Paperight, the outlet is granted a very limited licence to print the number of copies of the document requested. If an outlet does not comply with the licence terms, its account with Paperight will be suspended.”
  • In the agreement, we’ve added this to “Term and termination”: “Watermarked documents may still be available to print through the Paperight system during the notice period.”
  • And we’ve added a “Licences” section, stating: “When a registered outlet requests a copyright work (e.g. a book or document) on paperight.com, they are automatically granted a license by the Rightsholder to reproduce and distribute the requested copyright works which the Rightsholder has made available through Paperight. The Rightsholder authorises Paperight to exercise any of the exclusive rights granted by law to the Rightsholder in order to enable Paperight to make the copyright works available to registered outlets in terms of this agreement, including but not limited to reproduction, distribution and transmission of the works.”

All in all, no changes to the way things work – just important clarifications. As always, we love feedback.

Drafting a rightsholder’s agreement

I’ve worked as a client and consultant with a range of distributors, retailers and publishers, and they all have standard legal agreements to cover their relationships with each other. Reading any of them is like slogging through a Canterbury Tale, but without the ribald payoff.

I’ve recently had the tricky task of writing Paperight’s standard agreement with rightsholders. My aim is to make it as short and simple as possible, without including anything dumb or being reckless by omission. It will change over time I’m sure. You can read it here. I hope others will help me find its many faults and vulnerabilities, and we’ll fix them by making it simpler and shorter, and never more complex.

We’re hiring a content manager

Update 31 Oct 2011: We’ve filled this position – we’re pleased to welcome Tarryn-Anne Anderson to the team from 1 November. She’ll be posting here, too, soon.

We’re assembling a small, flexible, confident team of ambitious hard workers. They’ll be joining me (more about Arthur here) to form Paperight’s first staff complement.

One of them will prepare and organise the documents we deliver on Paperight. We’re calling that person a content manager for now.

We’re just getting going, so this is a four-month contract based in Cape Town, paying between R8000 and R12000 per month. We’ll see how things go from there.

If this is you, you’re likely to be just getting going, career-wise, and have a couple of years’ experience working with book-like documents – anywhere in any field, as long as you’ve learned to be organised, meticulous, efficient, and a consummate problem-solver. We can train you in the technical skills, so what matters more now is how you work and think, and whether we think the same way about being a success and changing the world. At Paperight, we’re setting out to change the way books are distributed in low-income communities. That social mission is as important as our financial one: to be a profitable business that enables other businesses to be profitable.

The content manager will work with rightsholders (publishers, non-profits, educational institutions, etc.) to get their documents onto Paperight. If you already know how to use tools like InDesign and Acrobat Pro, and can batch convert documents and edit basic HTML, you’re well set for the skills part. If you’re already used to working with rightsholders, talking through contracts, solving problems, dealing with uncertainty, and balancing the schlep of detailed legwork with the excitement of big-picture planning, then you’ve probably got the thinking part down too.

If this kind of work sounds interesting, apply by emailing team at paperight dot com. Your cover letter is much, much more important than your CV. In fact, don’t even send a CV if you can just link to a LinkedIn profile or something similar. If you have a blog or Twitter account (about anything at all), let us know – an outward-looking life, online or otherwise, scores extra points. We’ll then have a phone conversation with people we think may be a good fit.

There’s no deadline: we’ll update this post when the position’s filled.

Development road

Yesterday the Paperight site went live. I could (and often do) say it’s just an alpha launch with a small, pilot-testing selection of titles, but that doesn’t do the moment justice. It’s very, very exciting to be up and running, even if there is loads still to do.

So here are some of the key developments and additions we have lined up for the medium and long term.

  • While the current site runs (probably for several months), and we learn from user behaviour, we’ll be building a much faster, lighter version from the ground up. Our users will be busy copy shops often in low-bandwidth areas, and every second will count. We want to make point-of-sale transactions absolutely painless.
  • Currently you can pay for documents by EFT or by using PayPal. We’ll add a prepaid account, which will make it easier for copy-shop staff to pay for rights instantly.
  • We’ll reduce the document-delivery time from 24 weekday hours to minutes or less, at any time, by automating more of the backend PDF processing.
  • We’ll add a lot more content. At the moment, we’re preparing every document manually. We’ll start fetching and preparing PDF content by API in bulk.
  • For rightsholders, we’ll add automated sales reporting, and allow territorial distribution restrictions.

In addition, we’ll be learning from our customers and reprioritising accordingly. If you’re a Paperight user, remember that every single suggestion you send us is valuable. Don’t hold back: send us your feedback or pop it in the comments below.