Jul 13 2009
CloudCamp London 4: Dealing with data security
CloudCamp London 4 took place last Thursday in Microsoft’s offices at Victoria, continuing the success of the previous meetings. From what I could see, about 250 people attended, and the overarching theme of the lightning talks this time was data security and using clouds for data storage.
Mark Cusack from Clearspace Software suggested clouds as a very good solution for data retirement: storing massive amounts of data that is almost at the end of its business life cycle. This kind of data is read-only and is queried less often than the data the business is currently working with, but it still needs to be kept for analytical or regulatory purposes. Instead of keeping it in their own data centres and spending hardware and effort on maintaining and retrieving it, companies can ship it off to the cloud and reduce costs. The key ideas to make this work are, according to Cusack, compressing the data before moving it (which saves on bandwidth costs) and querying the data in the cloud rather than retrieving it for analytics. As the data is compressed, we can afford to make multiple copies and distribute them to multiple clouds for availability. Compression also plays to the strengths of the cloud, because it converts a typical IO-bound problem (storage access) into a CPU-bound problem (decompression), which allows us to use cloud resources efficiently. To be able to query this data in the cloud and still keep it safe, Cusack suggested encrypting the network pathways and the data rest-points used by queries, and employing tamper-proofing and auditing with message digests. Although not ideal, Cusack said that sending the encryption key to the cloud only to execute a query, without persisting it anywhere, might be a viable solution for a lot of companies, as it reduces the risk window to the time when a query is executing.
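To make the compress-then-verify idea a bit more concrete, here is a minimal sketch in Python. It is not Cusack's product or method, just an illustration built on the standard library: the function names `archive` and `query` and the sample records are made up for this example, and encryption of the archive and the network path is left out entirely (that would need a proper cryptography library).

```python
import hashlib
import zlib


def archive(records: list[str]) -> tuple[bytes, str]:
    """Compress records before upload; return (blob, SHA-256 hex digest of the blob)."""
    raw = "\n".join(records).encode("utf-8")
    blob = zlib.compress(raw, level=9)  # smaller blob, less bandwidth to pay for
    return blob, hashlib.sha256(blob).hexdigest()


def query(blob: bytes, expected_digest: str, needle: str) -> list[str]:
    """Check the digest, then decompress in memory and filter the records."""
    if hashlib.sha256(blob).hexdigest() != expected_digest:
        raise ValueError("archive digest mismatch: possible tampering")
    text = zlib.decompress(blob).decode("utf-8")  # CPU-bound step, done next to the data
    return [line for line in text.split("\n") if needle in line]


# Hypothetical retired records; the digest is kept by the data owner.
records = ["2001-03-04,ACME,120.50", "2001-03-05,Globex,75.00"]
blob, digest = archive(records)
print(query(blob, digest, "ACME"))
```

The digest is stored separately from the archive, so a copy that has been modified in any of the clouds fails the check before it is ever queried.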
Miranda Mowbray from HP presented an alternative solution for data security in the cloud: obfuscating the data for privacy. Her solution is at the prototype stage and involves encrypting the data in a way that preserves its structure, so that applications can still work with the data directly. For example, dates can be shifted by a fixed amount, customer names can be replaced with encrypted versions, and so on. The goal is not to have any clear-text sensitive information in the cloud at all. To query the data, the query is obfuscated as well and runs directly on the obfuscated data, with the obfuscation software decrypting the results on the fly as they come back from the cloud. Mowbray said that this approach might not be secure against all attacks or practical for all applications, but that all but 2.5% of reported privacy incidents involve non-obfuscated data, so obfuscation significantly reduces the risk.
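As a rough illustration of structure-preserving obfuscation, the sketch below shifts dates by a secret offset and replaces customer names with keyed tokens. This is not Mowbray's prototype: the key, the offset, the field names and all function names are hypothetical, and instead of reversible encryption the names are replaced with HMAC-based pseudonyms plus a lookup table that stays on the client.

```python
import hashlib
import hmac
from datetime import date, timedelta

SECRET_KEY = b"demo-key-kept-on-premises"  # hypothetical; never sent to the cloud
DATE_SHIFT = timedelta(days=1234)          # hypothetical secret offset


def pseudonym(name: str) -> str:
    """Deterministic keyed token for a name (HMAC-SHA256, truncated to 16 hex chars)."""
    return hmac.new(SECRET_KEY, name.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def obfuscate(row: dict, lookup: dict) -> dict:
    """Return the row as it would be stored in the cloud."""
    token = pseudonym(row["customer"])
    lookup[token] = row["customer"]  # the mapping stays on the client side
    return {"customer": token, "order_date": row["order_date"] + DATE_SHIFT}


def deobfuscate(row: dict, lookup: dict) -> dict:
    """Restore a row that came back from the cloud."""
    return {"customer": lookup[row["customer"]],
            "order_date": row["order_date"] - DATE_SHIFT}


lookup = {}
rows = [{"customer": "Alice", "order_date": date(2009, 7, 9)},
        {"customer": "Bob", "order_date": date(2009, 6, 1)}]
cloud_rows = [obfuscate(r, lookup) for r in rows]

# The query is obfuscated too: search for the token, not the clear-text name.
hits = [r for r in cloud_rows if r["customer"] == pseudonym("Alice")]
print([deobfuscate(r, lookup) for r in hits])
```

Because every date moves by the same offset, ordering and intervals between dates are preserved, so range queries and date arithmetic still work on the obfuscated copy. The flip side, in line with Mowbray's caveat, is that a simple scheme like this leaks such patterns to anyone who can see the cloud data.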
Anish Mohammed from CapGemini then talked about clouds and security from an evolutionary perspective. He pointed out trust and managing multi-tenancy as the key challenges for clouds today. According to Mohammed, what matters most for survival from an evolutionary perspective is adaptability, flexibility and cost effectiveness. He compared huge in-house data centres to dinosaurs, which are strong but inflexible, and cloud deployments to mammals, which are not as strong or secure but are very flexible. Mohammed said that the mammals are coming out of the woods, that there is a trade-off between security and usability because security has computational costs, and that the ecosystem would in the end define the security restrictions.
Phil Wainewright from ZDNet attacked the idea of private or hybrid clouds (a very popular topic at CloudCamp London 2). Arguing that private clouds are a bad idea, Wainewright said that “Connected clouds are the stuff of the future – captive clouds live behind the firewall and should not be allowed”. According to Wainewright, the three key principles of cloud computing are abstracting horizontal elements to an API, cloud-scaling and being connected. In such a scenario, any improvement to the abstracted services is instantly available to everyone, while a custom private cloud architecture will start to miss out on improvements at some point. Connected clouds also benefit from community aggregation: they have more users, which means better tested infrastructure with more relevant benchmarks. Multi-tenancy forces them to share connections as well as code, providing open APIs and mediated integration. Wainewright said that the future belongs to the “fitmost” (explained as the most connected, not the fittest individually), because openness to connectivity allows agile adoption of new resources and better mobility, concluding that “the more open you are to the cloud, the more easily you can connect to what’s out there.”



Thanks for the comprehensive summary of my lightning talk, it’s probably more articulate than my presentation! With hindsight, I wish I’d emphasised the importance of data compression more during the lightning talk. As I mentioned in my blog, http://tinyurl.com/mrlgfc, there were several comments throughout the evening about the impracticalities of moving large data sets in the cloud, all of which can be addressed by moving, storing and querying data in compressed form.