May 18, 2015
A Quest for Syncable Private Online Storage
Apps need to sync data, either documents or preferences,
among our devices. Syncable means that modifications made on one
device must be transferred to other devices swiftly. Private means that
data must be encrypted with a user-provided key before it is uploaded to the server,
so that neither the cloud provider nor the app developer can look inside your documents.
These requirements seem to contradict each other. Encrypted data is
expensive to sync. Even if you change just one byte, the newly
encrypted data can be entirely different, so a full sync
has to copy every byte. However, if we design the storage file to be
append-only and use a stream cipher, then both goals can be met.
An append-only file is opened for reading and writing, except that
writes always happen at the end of the file. For C programmers, such a
file can be opened with, for example, fopen(path, "a+").
Since it's append-only, it's easy to sync by comparing file sizes and
downloading only the missing data at the end. Another benefit is
that existing data is never overwritten, so a bad write cannot corrupt it. When things go wrong, we can
simply revert to an earlier version. A stream cipher encrypts data on the fly as it
is appended, so there is no need to re-encrypt the whole file from the start.
Effectively, we also get encrypted
version-controlled storage.
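A minimal Python sketch of the idea above. The keystream here is a toy built from SHA-256 over a block counter, chosen only so the example runs with the standard library; it is an assumption for illustration and not a vetted cipher (a real system would use something like ChaCha20). The function names are hypothetical.

```python
import hashlib

BLOCK = 32  # SHA-256 digest size in bytes


def keystream(key: bytes, offset: int, length: int) -> bytes:
    """Toy keystream addressed by file offset. Illustration only, not real crypto."""
    if length == 0:
        return b""
    first = offset // BLOCK
    last = (offset + length - 1) // BLOCK
    out = bytearray()
    for counter in range(first, last + 1):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
    start = offset - first * BLOCK
    return bytes(out[start:start + length])


def xor_at(key: bytes, offset: int, data: bytes) -> bytes:
    """Encrypt or decrypt data that lives at the given offset of the log."""
    stream = keystream(key, offset, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def append_encrypted(log: bytearray, key: bytes, plaintext: bytes) -> None:
    """Encrypt only the new bytes and append them; earlier bytes are untouched."""
    log += xor_at(key, len(log), plaintext)


def pull_tail(local: bytearray, remote: bytes) -> int:
    """Sync by size: copy only the bytes the local copy is missing."""
    missing = remote[len(local):]
    local += missing
    return len(missing)
```

Because the keystream depends only on the byte offset, appending never changes the ciphertext that was already uploaded, so a device that is behind only has to fetch the tail.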
The drawback of append-only storage is that, unlike usual database
systems, we need to build an external index file for fast queries. The
external index file has to be built on first load and updated
whenever new data comes in. It can also be
encrypted, so that even if other people get access to your device, your index
file is still safe.
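As a rough sketch of such an index, assume the (decrypted) log is a sequence of length-prefixed key/value records. That record layout and the function names below are assumptions made for this example, not a format described above.

```python
import struct


def pack_record(key: bytes, value: bytes) -> bytes:
    """Hypothetical record: 4-byte body length, 2-byte key length, key, value."""
    body = struct.pack(">H", len(key)) + key + value
    return struct.pack(">I", len(body)) + body


def update_index(log: bytes, index: dict, start: int = 0) -> int:
    """Scan records from `start`, remember the latest offset of each key,
    and return the position where the next scan should resume."""
    pos = start
    while pos + 4 <= len(log):
        (size,) = struct.unpack_from(">I", log, pos)
        if pos + 4 + size > len(log):
            break  # incomplete trailing record, wait for more data
        (key_len,) = struct.unpack_from(">H", log, pos + 4)
        key = bytes(log[pos + 6:pos + 6 + key_len])
        index[key] = pos  # later records shadow earlier ones
        pos += 4 + size
    return pos


def lookup(log: bytes, index: dict, key: bytes):
    """Return the latest value for `key`, or None if it was never written."""
    pos = index.get(key)
    if pos is None:
        return None
    (size,) = struct.unpack_from(">I", log, pos)
    (key_len,) = struct.unpack_from(">H", log, pos + 4)
    return bytes(log[pos + 6 + key_len:pos + 4 + size])
```

The first call with start=0 is the full build on first load; after a sync, calling update_index again with the previously returned position indexes only the newly appended records.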
Normally a remote server is required to help devices sync with each
other. The server only has meta information, for example the size of the
data file and the timestamps of updates from clients. It can do some basic
conflict resolution. When a client tries to push (append) new data,
the server requires the client to provide its local head position (the same
as its file size) and checksum. If the client's head position is not equal to the
head on the server, the server rejects the update. The
client should first catch up with the server's head position by
downloading the missing data and resolving conflicts locally. Since
the server has no idea of the contents, keeping the content in a consistent
state is the responsibility of the clients. A badly behaving client
could post garbage data to the server. Even in that case, we can still
revert the data to an earlier version and revoke access for those
bad clients if necessary.
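The push rule can be sketched in a few lines of Python. The class and method names are hypothetical, the checksum is assumed to be SHA-256 over the whole file, and local conflict resolution is left out.

```python
import hashlib


class MetaServer:
    """Stores opaque (encrypted) bytes and only checks head position and checksum."""

    def __init__(self):
        self.blob = bytearray()
        self.digest = hashlib.sha256()  # running checksum of everything stored

    @property
    def head(self) -> int:
        return len(self.blob)

    def push(self, client_head: int, client_checksum: str, data: bytes) -> bool:
        """Accept an append only from a client that is fully caught up."""
        if client_head != self.head or client_checksum != self.digest.hexdigest():
            return False
        self.blob += data
        self.digest.update(data)
        return True

    def fetch(self, start: int) -> bytes:
        """Return everything after `start` so a client can catch up."""
        return bytes(self.blob[start:])


class Client:
    def __init__(self):
        self.blob = bytearray()

    def checksum(self) -> str:
        return hashlib.sha256(self.blob).hexdigest()

    def catch_up(self, server: MetaServer) -> None:
        self.blob += server.fetch(len(self.blob))

    def push(self, server: MetaServer, data: bytes) -> bool:
        accepted = server.push(len(self.blob), self.checksum(), data)
        if accepted:
            self.blob += data
        return accepted
```

A client whose push is rejected calls catch_up, resolves any conflicts on its side, and then pushes again.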
Like git, every client has a full copy of the data,
so it can switch to another remote storage provider at will.
Data syncing can also be peer-to-peer.
Two local storages can negotiate a common head
position by exchanging checksums of different portions of the data, and then
merge their differences from that point on.
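One way to picture the negotiation, as a sketch only: each peer sends SHA-256 checksums of fixed-size chunks, and the other side walks them until the first mismatch. The chunk size and function names are assumptions for this example.

```python
import hashlib


def chunk_checksums(data: bytes, chunk: int = 4096) -> list:
    """Checksums of consecutive fixed-size chunks (the last one may be shorter)."""
    return [
        hashlib.sha256(data[i:i + chunk]).hexdigest()
        for i in range(0, len(data), chunk)
    ]


def common_head(mine: bytes, peer_sums: list, chunk: int = 4096) -> int:
    """Byte offset up to which my data matches the peer's chunk checksums."""
    head = 0
    for i, peer_sum in enumerate(peer_sums):
        piece = mine[i * chunk:(i + 1) * chunk]
        if not piece or hashlib.sha256(piece).hexdigest() != peer_sum:
            break
        head += len(piece)
    return head
```

The result is a conservative lower bound at chunk granularity: the true point of divergence may lie inside the first mismatching chunk, so a smaller chunk size (or a second pass over that chunk) narrows it down.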
Going forward, this is how app developers should protect users' private data,
and this is how we can close possible backdoors to
user data while still providing the convenience of fast syncing.
Mar 20, 2015
Trusted Cloud Computing
We all know non-public data on the cloud servers should be encrypted.
What if data has to be processed right on the servers?
Data processing programs need to know about the encryption key,
however, we must only hand over the key to programs that we can trust.
Trusted programs are those we can build from source,
which means that we can embed one-time-use secrets in them.
Every time we want to run a program on the server,
a different executable copy with different secrets is uploaded to the server,
and the server should launch it as soon as possible.
The running program has to answer questions correctly and quickly (so that there is no time to reverse engineer the secrets)
before we send over the data access key.
A trusted program must keep the key only in memory, never write the key to disk,
and should hide or destroy the key after use.
If it's restarted, it will have to ask for the key again,
then we will know something is wrong.
Open source programs are easier to reverse engineer,
so we must add the secrets to them in an obfuscated way
to make sure an attacker cannot reveal them in a short time.
Depending on the security measures, the access key must be invalidated or the data removed after a certain period.
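The question-and-answer step could look roughly like the Python sketch below. It assumes the "question" is a random nonce and the "answer" is an HMAC of that nonce under the embedded one-time secret; the post does not specify the actual questions, so this scheme, the time limit, and all names are illustrative assumptions.

```python
import hashlib
import hmac
import os
import time


def answer_challenge(secret: bytes, nonce: bytes) -> bytes:
    """What the uploaded program computes with its embedded one-time secret."""
    return hmac.new(secret, nonce, hashlib.sha256).digest()


class Verifier:
    """Runs on our side and releases the data key only for a fast, correct answer."""

    def __init__(self, secret: bytes, max_seconds: float = 2.0, clock=time.monotonic):
        self.secret = secret
        self.max_seconds = max_seconds
        self.clock = clock
        self.nonce = None
        self.issued_at = None

    def challenge(self) -> bytes:
        self.nonce = os.urandom(16)
        self.issued_at = self.clock()
        return self.nonce

    def release_key(self, answer: bytes, data_key: bytes):
        """Return data_key if the answer is correct and on time, else None."""
        if self.nonce is None:
            return None
        expected = answer_challenge(self.secret, self.nonce)
        on_time = self.clock() - self.issued_at <= self.max_seconds
        self.nonce = None  # each challenge can be answered only once
        if on_time and hmac.compare_digest(answer, expected):
            return data_key
        return None
```

A late answer, a wrong answer, or a second answer to the same challenge all yield None, which is the signal that something is wrong and the key should not be sent.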
Feb 10, 2015
Raspberry Pi 2 is a game changer
Since the RPi came out, I have bought several RPi B and B+ boards. They are very
useful as test servers. I have also developed kiosk-type commercial
applications using the RPi, mainly for information display. These Pis are
usually attached to a TV and online 24 hours a day. They are very
robust: I haven't received any complaints since they were shipped and
installed. On the other hand, they aren't powerful enough, so
I don't think they are of much use to consumers.
Now the announcement of the RPi 2 changes everything:
- A 900MHz quad-core ARM Cortex-A7 CPU
- 1GB RAM
If those numbers don't mean anything to you,
then remember that this is a $35 computer that
has roughly the computing power of four iPhone 4s.
You can't imagine what this tiny computer is capable of.
The first thing I should do is port Aemula (a 486 emulator) to the RPi 2.
Then maybe try out other interesting ideas on it.
If you have any thoughts, or need a custom application based on the RPi 2,
please let me know.
Feb 07, 2015
BtStamp is an app that can timestamp important documents using the bitcoin blockchain,
and it's now available in the App Store.
How It Works
BtStamp pushes a SHA256 digest of your document into the bitcoin blockchain, thereby creating a proof that the document existed at the time it entered the blockchain.
Useful in these cases:
- Prove that a work is done, in the form of a deliverable file, before a specific time.
- Prove that you had prior knowledge before signing an NDA, or before a patent is filed by a competitor.
- Prove that a document, for example a PDF digital contract, has not been tampered with after it was signed and timestamped, by comparing its current SHA256 hash to the hash kept in the bitcoin transaction.
- Prove that a photo or a recording is taken before a specific time.
Secure and Private
BtStamp calculates the SHA256 hash locally: the actual document never leaves your device. The proof is kept in the blockchain permanently as a transaction. Even if the BtStamp service is down or the app is unavailable, you can still search for a registered document's digest on well-known bitcoin websites, locate the transaction and find out when it entered the blockchain.
The timestamp is done anonymously. No email address required.
When the time comes to prove it, you must be able to produce the original document. Calculate its SHA256 digest with a third-party tool and search for the first 40 hexadecimal characters of that digest on a bitcoin website (for example blockchain.info). You will find the transaction, with the complete SHA256 digest in its output script. The search can be skipped if you keep the transaction id somewhere, but you still need to provide the original document and its digest.
- You must not edit or modify a registered document. Even the slightest modification will generate a very different digest, which won't match the one you have registered in the blockchain.
- You may want to email the registry info to yourself. The document could be attached too.
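For the verification step, any independent SHA256 tool will do. As one possible sketch, the Python below computes the digest of a document and the 40-character prefix to search for; the function names are made up for this example and are not part of BtStamp.

```python
import hashlib


def sha256_hex(stream, chunk_size: int = 65536) -> str:
    """SHA256 digest (64 hex characters) of a binary file-like object."""
    digest = hashlib.sha256()
    for block in iter(lambda: stream.read(chunk_size), b""):
        digest.update(block)
    return digest.hexdigest()


def search_prefix(digest_hex: str, length: int = 40) -> str:
    """The leading hex characters to paste into a blockchain search box."""
    return digest_hex[:length]


def still_matches(stream, registered_hex: str) -> bool:
    """True if the document's current digest equals the registered one."""
    return sha256_hex(stream) == registered_hex.lower()
```

In practice the stream would come from open(path, "rb") on the original document. Changing even a single byte produces a completely different digest, which is why a registered document must be kept unmodified.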
Jan 30, 2015
On App Building Solutions
Facebook's answer to cross platform app development:
A Deep Dive on React Native
I dragged myself through the whole presentation; still, some good ideas caught my attention.
- Make use of the platform's native toolkit instead of emulating it with HTML5.
- Use a custom layout engine to build the view tree.
- Incremental tree rebuild.
Happy to see the industry looking for more pleasant ways of building apps.
I have also been thinking about the problem for quite a while.
My wish list for an ultimate solution (at least for the next 30 years):
- App can be inspected and changed on the fly, no need to relaunch for non-primitive modifications.
Think of how sculptors work, or even better, gardeners.
- Built-in revision control. git still feels too heavy.
- One source that can be adapted to different platforms. Think of how we use CSS prefixes for different platforms.
Not pretty, but useful.
- Auto recalculation: the connection between state and display elements should be seamless.
Think of Excel.
- Text editors are no longer our primary tools.
- Stay close to metal.