From Physics to properties: Making the move to a UK rental start-up from CERN
By Guest Contributor Published: 19:05, 11 January, 2017 Updated: 19:05, 11 January, 2017
by Boris Mangano, Senior Data Scientist at Movebubble
Growing up, I had two dreams: to become a physicist and to work in a start-up. Having spent over a decade in the traditional world of physics, working at CERN, the international particle physics laboratory, on the team that discovered the Higgs boson, it was time to realise the second dream.
Whilst from the outside the move from physics lab to UK rental start-up may have seemed an odd one, it offered the chance to implement many of the skills and data analytic techniques I used as a scientist in a company where I could make some real impact.
The start-up I chose was Movebubble, the only renter-dedicated platform that enables users to move homes in just a few taps.
As an online application used by renters and agents alike, Movebubble had the scope not only to gather large amounts of data but also to analyse it to improve renters’ overall experience for good. Renters can currently discover thousands of properties in real time, book viewings and secure a home hassle-free.
I saw a huge opportunity for how data analytics and machine learning could not only enhance what Movebubble was currently doing but improve its service to renters.
Machine learning bursting the rental bubble
Machine learning (ML) has become something of a buzzword over the past few years. In general, ML is the part of computer science that focuses on software able to learn without being explicitly programmed.
The two main applications are ‘classification’ and ‘regression’ and we currently use both forms to help renters.
Using ML for classification, we can predict genuine property listings from common ‘bait and switch’ property listings. During lulls in the market, it can be tempting for agents to leave unavailable properties on online markets to ‘bait’ renters and then ‘switch’ to offer them alternative properties.
With ML, we can classify which properties online are likely to be of this type and save renters from wasting their time by removing them from the platform.
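To make the classification idea concrete, here is a minimal sketch using scikit-learn. The features (days on the market, price relative to the area median, agent response time) and the training data are invented for illustration; this is not Movebubble’s actual model.

```python
# A hypothetical sketch of 'bait and switch' listing classification.
# Feature names and data are invented for illustration only.
from sklearn.linear_model import LogisticRegression

# Each row: [days_listed, price_vs_area_median, agent_response_hours]
X_train = [
    [5,   1.00,  2],   # fresh, fairly priced, responsive agent: genuine
    [90,  0.70, 48],   # stale, suspiciously cheap, slow agent: likely bait
    [12,  0.95,  6],
    [120, 0.65, 72],
]
y_train = [0, 1, 0, 1]  # 1 = likely 'bait and switch', 0 = genuine

clf = LogisticRegression()
clf.fit(X_train, y_train)

# Score a previously unseen listing; a positive prediction flags it
# as a candidate for removal from the platform.
new_listing = [[100, 0.68, 60]]
print(clf.predict(new_listing))
```

A real system would of course train on many thousands of labelled listings and validate the classifier before removing anything automatically.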
Utilising ML for regression enables you to predict continuous quantities from a set of input data.
A regression application of ML at Movebubble could enable us to predict the ‘fair’ rent of a property given its location, the number of rooms or the type of furniture in the property.
This will be of particular help to users who are perhaps moving to a new city where they don’t know much about the area, or for users who have a particular set of property criteria.
Whilst you could write classification and regression programs explicitly, using a long sequence of if-else-then statements, utilising machine learning is far more efficient.
The programmer acts as a supervisor, giving the machine a series of examples and the corresponding correct answers, effectively ‘training’ the machine until it learns how to generalise.
Once it has learned, for example, that the average rental price in Clapham for a three-bed flat is £1,500 a month, it can make predictions about previously unseen data, such as similar new properties recently added to the market.
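The regression idea can be sketched in a few lines: a toy ordinary-least-squares fit of monthly rent against the number of bedrooms. The figures are invented for illustration, not real market data.

```python
# Toy regression: fit monthly rent against bedroom count by least squares.
# Observations are invented for illustration: (bedrooms, monthly_rent_gbp).
data = [(1, 1100), (2, 1300), (3, 1500), (4, 1750)]

n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Slope and intercept from the least-squares normal equations
slope = sum((x - mean_x) * (y - mean_y) for x, y in data) / \
        sum((x - mean_x) ** 2 for x, _ in data)
intercept = mean_y - slope * mean_x

def predict_rent(bedrooms):
    """Predicted 'fair' rent for a previously unseen property."""
    return intercept + slope * bedrooms

print(round(predict_rent(3)))  # predicted rent for a three-bed flat: 1520
```

A production model would use many more features (location, furnishing, floor area) and a library such as scikit-learn, but the principle of learning a mapping from examples is the same.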
In order to improve the users’ experience of renting in the UK, my job is to first harvest from the platform the data that contains the most valuable information about the problems facing renters today.
After writing the initial algorithms, it is essential to monitor that the machine is actually learning, and to understand the pace at which it is learning.
Does it require additional data to improve its performance or speed up its learning? Has it already reached the point of diminishing returns? Often the answer is not more data, but a different type of data.
Also important to get right is the type of machine learning algorithm that best fits the problem you are trying to solve.
There is a long list of algorithms (random forests, support vector machines, logistic regression, shallow artificial neural networks and so on) that have been used routinely in different industries for at least 20 years. For the business tasks Movebubble currently faces, conventional ML algorithms are more than enough for the time being.
I believe more advanced machine learning, such as deep learning, should only be used for specific tasks that really require the ‘fire power’. Otherwise, the overhead costs associated with implementing and training the algorithm don’t provide sufficient ROI.
Monthly Review: Welcome to the digital-dependent society on data centres, cloud and data
By João Marques Lima Published: 10:08, 30 January, 2017 Updated: 10:08, 30 January, 2017
The first month of 2017 has not disappointed. We saw big acquisitions, large data centre openings, many partnerships and a shifting geo-political landscape that is shaping the industry for the months ahead. Here are the highlights.
January was a month of big figures. $3.5tr, the amount expected to be spent on IT in 2017. $175bn, the amount to be spent on data centre systems. $48bn, the predicted colocation and wholesale data centre market revenues by 2021.
The figures show the ever-growing digital dependence of societies around the world on the data centre, as the data economy starts to deliver its first results.
However, we also saw this month that despite businesses’ dependence on data centres being on the rise, many are still lacking the knowledge to identify the services they need and spot misleading providers.
We covered the story of ServerLoft, a Brazilian IT services provider, which left 16,000 customers without access to their data in the country’s “largest digital blackout” ever.
The case is far from over as the lawyer in charge of representing several customers exclusively explained to Data Economy here. Caught up in the controversy are other companies such as Equinix, Dell, Juniper Networks and VMware. We will continue to follow this story.
Also in Brazil, a power cut is said to have been the reason for an IBM data centre going offline, with some customers experiencing up to eight hours of downtime.
On the other side of the ocean, in Europe, data centres were also hit by a power outage in Amsterdam which killed two people. We spoke to local providers Equinix and Switch Datacenters on the incident.
A lesson from this outage came, however, from The Data Center Group, whose facility in the Dutch capital experienced mechanical problems during the power cut.
CEO Siemon van den Berg was quick to react and inform customers, proving how crucial it is for companies today to be transparent and in constant contact with their clients.
This month we also found that the UK is going for a ‘hard-Brexit’, and saw many moves influenced by this. For example, a subsea cable linking Marseille to New York is avoiding “the chaos around Brexit” with those in charge telling companies to “avoid the UK completely and go directly to New York”.
We also heard a Microsoft manager hinting at the possibility of the company moving its data centre focus elsewhere, should a post-Brexit UK increase tariffs. Microsoft was quick to respond, saying the comments were not “reflective of the company’s view”.
But not all was bad. French cloud provider OVH finally lifted the veil on its UK data centres, with three set to open in East London in the coming months. In its announcement, the company labelled London as the tech city of Europe. Can Brexit destroy that?
As industry leaders pointed out at the Finance and Investment Forum 2017, the UK will not close down for business after Brexit and it is still early to say what is going to happen. ‘Uncertainty’ seems to remain as the keyword.
In the Nordic region, the industry keeps accelerating and taking new disruptive approaches. For example, Stockholm has set a goal of meeting 10% of the capital’s heating needs with heat from data centres. (Air France has also used heat from its data centre in its offices.)
Facebook has made headlines for finally confirming it is building a data centre in Odense, Denmark. We explored why the web scale giant has chosen Denmark for its third non-US data centre while it aims for five billion users worldwide.
940km away, in Vilnius, Lithuania, a shocking decision by the government to halt the construction of a data centre, over fears it could be used by Russian hackers to spy on the country, caught many by surprise. Developers have appealed against the decision and promise to fight for the hub to be built.
The decision was taken in the wake of a debate in the US over fears that Russia had interfered with the elections in November 2016. And this brings us to one of the biggest events of the year: the arrival of Donald Trump at the White House on January 20.
In one week, President Trump has reshaped American politics and caused quite a stir across different sectors, including technology, where bosses at Facebook, Microsoft, Google, Apple, Tesla and more have spoken out against some of the policies. Data Economy has run down a list of things that will not happen under Trump’s presidency to give the sector some peace of mind.
In the data centre space, the US government is on a journey to reduce IT costs in the Army, but plans seem to have stalled forcing the Army Secretary to intervene.
Elsewhere in North America, we saw HPE invest $650m in data centre startup SimpliVity, Facebook’s new Los Lunas data centre predicted to generate $2bn for the local economy, Equinix secure $1bn to fund its Verizon acquisition, and Switch Supernap get a critical yes for its $5bn pyramid data centre.
In the MEA region, Ooredoo expanded its data centre in Qatar while in Tanzania, the government is preparing to ban all data centre builds by the public sector in a push to force them to use a Tier III facility run by TTCL.
Lastly, the APAC region wants to position itself as the next leader in digital services, and China seems to be taking the lead. The country has launched a $14.6bn investment fund to become an internet superpower.
Also in China, Hong Kong and Shenzhen put a 20-year-old border dispute behind them and announced the construction of a 13 million sq ft innovation and technology park, days after Huawei moved its data centre out of Shenzhen, which rekindled old rumours that the company is looking to relocate its HQ.
In India, Sify Technologies, which works with 43 data centres in the country, received a warning from NASDAQ over its low share price. The Indian data centre infrastructure market has also been predicted to grow at a 3.72% CAGR until 2020, when it should top $2.45bn.
In Malaysia, data centres are set to profit to the tune of $244m, while in South Korea the Tata Group is readying to build a data centre aimed at the country’s $1.8bn connected car market, and in Singapore, Singtel opened a $280m, 570,000 sq ft data centre.
We close this monthly roundup by going back to the World Economic Forum, held in Davos, where tech executives, celebrities and world leaders pointed to the possibility of social unrest worldwide if governments fail to deploy technology while keeping humans part of the revolution.
Identity Resolution – the must-have marketing watchword for 2017
By Guest Contributor Published: 17:14, 20 January, 2017 Updated: 17:14, 20 January, 2017
by David Barker, Global Product Director, Acxiom
Identity resolution may sound like a TV crime drama or a Hollywood blockbuster, but it is set to be the watchword for all integrated marketing strategies for years to come. In a nutshell, it means being able to identify and unify a customer’s interactions with a brand across all touchpoints.
This is one of the major concerns facing marketers today. In a world where a single individual could be interacting across desktop, tablet, mobile, real-world visits to a location, a plethora of social media channels and, of course, the long-established advertising media and contact channels such as call centres, ensuring that interactions from one user can be connected across all of these methods is perhaps one of the most pressing challenges for businesses.
This is precisely the topic of a new Forrester paper, entitled ‘The Strategic Role of Identity Resolution’, in which the research and advisory specialist discusses why identity resolution is such an imperative for businesses and marketers of all sizes.
When consumers move from one technological channel to another when interacting with brands, they naturally expect the brands to be able to move with them, seamlessly.
And as the power of the You Know Me society (as this trend has been dubbed) grows, customers will become increasingly irate if a business persistently does not recognise them as a valued customer, or indeed a brand new one who should be tempted over the line with an introductory offer.
The root of efficient identity resolution lies, naturally, in data. When a consumer ‘presents themselves’ at a touchpoint, they carry various identifiers, some traditional, some new.
From your name and postal address to your email address, cookies and your ‘digital fingerprint’, each has the potential to help recognise you as an individual. Before we continue, it must be stated that in all cases the data, regardless of form, needs to be collected and used in privacy-compliant ways.
Consumer trust is at the foundation level of all data-driven marketing and identity resolution should seek to build it through a better customer experience, not erode it.
Having data, however important it is, is only the starting point of a more complex process. Without over-complicating the principle, the next key step is to create and manage an ‘identity graph’.
This is a range of identifiers related to an individual, so that when an interaction occurs, the identifiers from that device and channel are matched against those in the identity graph, returning what is best described as a link that allows the database or system to connect together all the data relating to that single individual.
Do this and you have more of the right data connected to the right people, meaning a better customer view, better insights, marketing and a better customer experience. Good business.
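In its simplest form, an identity graph is a lookup from every known identifier to one internal link ID. The sketch below is a deliberately minimal Python illustration; the identifiers and link ID are invented, and real identity graphs involve probabilistic matching at far larger scale.

```python
# A toy identity graph: every known identifier for an individual maps to a
# single internal link ID, so an interaction from any channel connects back
# to the same profile. All identifiers here are invented examples.
identity_graph = {
    "cookie:abc123":           "person-42",
    "email:john@example.com":  "person-42",
    "device:ios-7f9e":         "person-42",
}

def resolve(identifier):
    """Return the link ID for a known identifier, or None for a stranger."""
    return identity_graph.get(identifier)

# An interaction arrives from a mobile device: it resolves to the same
# individual as the cookie and email interactions.
print(resolve("device:ios-7f9e"))
```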
One key subtlety, and a major challenge for marketers, is ensuring we recognise an individual without necessarily knowing the specific person by name, something that speaks to the difference between PII (Personally Identifiable Information) and de-identified data.
To operate within privacy legislation, marketers need to respect the fact that while consumers are active in the digital space, unless they choose otherwise, they often wish to operate without their identity being known.
This is a key part of leading identity resolution capabilities. Done properly, while we may know from the brand’s PII that John Smith has bought a new TV, he may then go into the digital space and visit an ad-supported site without identifying himself.
The critical ability is to match John’s digital identifiers not to John Smith the person, but to another encrypted (or ‘hashed’, as it is known) identifier that has a TV purchase associated with it.
This means that the brand he bought the TV from can decline to show an ad it might otherwise have shown, or can choose to serve a complementary one, perhaps for a sound bar or Blu-ray player.
The data has been associated, John’s identity has not been revealed in the digital space, the brand delivers a better message and John enjoys a better customer experience.
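The hashing mechanic described above can be sketched in a few lines of Python. Everything here is a simplified assumption for illustration: the salt, the identifiers and the purchase record are invented, and production systems layer far more on top (key management, match tables, consent checks).

```python
# A minimal sketch of matching on hashed identifiers rather than raw PII.
# The salt, email addresses and purchase record are invented examples.
import hashlib

SALT = b"example-shared-salt"  # in practice, managed securely, never hard-coded

def hashed_id(identifier: str) -> str:
    """One-way hash of an identifier, so raw PII never enters the ad space."""
    return hashlib.sha256(SALT + identifier.lower().encode()).hexdigest()

# An offline TV purchase is recorded against a hashed identifier, not a name.
purchases = {hashed_id("john.smith@example.com"): {"tv"}}

# Later, the same individual shows up on an ad-supported site. Only the
# hashed identifier crosses into the digital space.
visitor = hashed_id("John.Smith@example.com")  # normalised, so it matches

if "tv" in purchases.get(visitor, set()):
    print("suppress TV ad; consider a complementary sound bar ad instead")
```

The key property is that the hash matches across channels while remaining one-way: the ad platform can associate the TV purchase with the visitor without ever learning that the visitor is John Smith.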
What’s in store for storage in 2017
By Guest Contributor Published: 13:53, 19 January, 2017 Updated: 22:27, 31 January, 2017
by Boyan Ivanov, CEO at StorPool Storage
Looking into the future and making accurate predictions can, at first glance, be a daunting prospect. Even so, several clear trends are emerging for the data storage industry in 2017.
I believe that innovation, software-defined storage, education and consolidation will be key themes for 2017 and here I explain my rationale for reaching this conclusion.
Innovation continues to be essential for future growth. For end users to extract the most value from their IT infrastructures in 2017, in my opinion it’s not about sourcing a particular product or technology. These will be reviewed of course, but the best way to extract the most value from an IT infrastructure is to innovate.
To get the most from its infrastructure an organisation must constantly develop, and accept and be ready to implement change. This will allow it to move forward, improve as a business, and grow both profits and market share.
Among the most significant trends, I believe, is that software-defined storage (SDS) will finally get a solid foothold in the market. It generally takes about five to eight years for a new product/solution group to deliver.
SDS has been fighting to join the mainstream since 2013, but until now it’s been the domain of innovators and early adopters. I expect SDS to move out of that niche market in 2017, and follow the model Geoffrey A. Moore set out in his book “Crossing the Chasm.”
Moore’s next market group – the early majority – is chomping at the bit when it comes to SDS and it wouldn’t be unreasonable to predict that the technology’s market share could rise to more than 20% in 2017.
In turn, this will drive even greater adoption of SDS in 2018-2020 whereas other technologies such as 3D NAND – where flash memory cells are stacked vertically in multiple layers for scalability – and NVMe will need a couple more years to get there.
A knowledgeable work force is critical to delivering the right solutions to end users. The world is changing and the same is true of data storage. Storage professionals wedded to the old way of doing things are going to struggle to keep abreast of the ever-increasing demand for faster access to and storage of data and that will impact not only on an organisation’s storage systems, but on the wider IT infrastructure and the business as a whole.
There are many competing solutions in the storage market, with many vendors advertising the same benefits, each with their advantages and disadvantages. It is hard to find deep, good quality information and understand the real-life differences between seemingly similar products. It can be mind-boggling.
Storage professionals need to “go back to school”: spend time, money and energy re-educating themselves, so that they understand forthcoming trends and which technologies are most suited to their organisations’ needs.
Finally, I expect further consolidation in the storage industry. Too many vendors with weak products were heavily funded by venture capital between 2010 and 2015.
The truth is, while suppliers can buy a lot of marketing and mind-share, they cannot make a customer buy the actual product, or even stay loyal, if the solution does not work well. I do expect consolidation to result in fewer storage vendors in 2017, and that better products will start to make a name for themselves.
To conclude, in 2017 we are going to witness more significant changes in the data storage sector throughout the whole chain from vendor to end-user with software-defined storage gaining more market share. It’s going to be an interesting year!
Ransomware extortion: The fightback starts now
By Guest Contributor Published: 22:49, 8 January, 2017 Updated: 22:51, 8 January, 2017
by Jeff Denworth, SVP Marketing, CTERA
Ransomware has risen to global prominence in the past year, as simple malware creation programs and encrypted payment methods make it easier and more lucrative than ever for criminals to hold data hostage. With US$1bn at stake this year, organisations need to set a strategy for beating ransomware, writes CTERA’s Jeff Denworth.
In 2016 we have seen an exponential spike in ransomware activity. According to the FBI, ransomware attacks have increased 35-fold this year over 2015, resulting in an estimated US$209m paid out to cyber-criminals. If the growth curve continues, ransomware is on track to be a $1bn business in 2016.
Ransomware: a particularly insidious form of malware that holds its victims’ files hostage until a “ransom” – typically ranging from hundreds of dollars to hundreds of thousands of dollars – is paid.
Ransomware payments are collected by criminals via anonymous bitcoin transactions, and it can cost anywhere from $500 to $2,000 to unlock an average PC. The anonymity makes it difficult to know precisely how many payments have been made, but no organisation is immune: attacks have targeted hospitals, schools, governments, law enforcement agencies and businesses of all sizes.
The growth is worrying but even as companies seek to combat it, ransomware criminals are developing new approaches. Two especially nasty tweaks to ransomware are starting to emerge:
- Certain cyber-criminals are using ransomware to copy data out of your network for the purpose of selling it to interested third parties, enabling industrial espionage.
- There have been reports of victims paying ransomware attackers but not receiving the keys to decrypt their PCs in return.
It’s a developing situation, but that should not mean organisations delay putting counter-measures in place to fight crypto-malware. CTERA advocates following three key steps:
Step #1. Secure your perimeter to minimise the chance of breach:
- Patch your operating systems and keep them up to date.
- Train employees on ransomware and their role in protecting the organisation’s data.
- Disable macro scripts from office files transmitted over email.
- Limit access to critical and rapidly-changing datasets to only need-to-know users.
Step #2. Back up all files and systems to avoid paying ransom to recover from crypto events.
- Back up your endpoints, back up your file servers.
- Implement lightweight, optimised data protection tools that minimise recovery points.
Step #3. Roll back to the most current data using sync.
Steps #2 and #3 are intertwined; the most effective way our customers have found to mitigate a ransomware attack is to combine enterprise-grade data protection tools with file sync technology.
This combination of backup and sync might not seem intuitive at first, especially since file sync and share is increasingly being viewed as a form of backup. While we don’t entirely agree with that view – given the need for backup tools that can protect entire systems and system profiles – sync does play a crucial role in ransomware remediation.
Consider that legacy backup software typically offers backup intervals – that is, the amount of time between backup cycles – of 12 to 24 hours. Essentially an entire business day or more becomes subject to loss when an organization “rolls back” to a non-infected state using traditional backup tools. Even the most efficient modern backup solutions have default backup intervals ranging anywhere from four to eight hours, which is nearly a full business day. Therefore, the same problem could essentially persist.
This is where sync technology provides an “event-based” data protection component that mitigates the blast radius of a ransomware attack. Enterprise File Sync and Share (EFSS) tools create incremental versions of files as they are changed and updated, and are protected on an event basis (a file save) as opposed to a scheduled basis (a pre-defined backup interval).
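The event-based versioning principle can be sketched simply: record a version on every save event, then roll back to the last clean version before the infection time. This is an illustrative toy in Python, not CTERA’s actual implementation, and the file names and timestamps are invented.

```python
# A minimal sketch of event-based versioning for ransomware recovery:
# keep a version on every file save, then roll back to the newest version
# saved before the infection time. Data here is invented for illustration.
from datetime import datetime

versions = {}  # path -> list of (timestamp, content)

def on_file_save(path, content, ts):
    """Record a new version on every save event, not on a backup schedule."""
    versions.setdefault(path, []).append((ts, content))

def roll_back(path, infected_at):
    """Return the most recent clean version saved before the attack."""
    clean = [(ts, c) for ts, c in versions.get(path, []) if ts < infected_at]
    return max(clean)[1] if clean else None

on_file_save("report.docx", "draft v1", datetime(2017, 1, 9, 9, 0))
on_file_save("report.docx", "draft v2", datetime(2017, 1, 9, 11, 30))
on_file_save("report.docx", "ENCRYPTED-BY-RANSOMWARE", datetime(2017, 1, 9, 12, 0))

# Roll back to the state just before the 12:00 infection: "draft v2" survives,
# so at most the work since the last save is lost, not a whole backup interval.
print(roll_back("report.docx", datetime(2017, 1, 9, 12, 0)))
```

Because a version exists for every save event, the recovery point is minutes rather than the hours lost when rolling back to a scheduled backup.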
CTERA Enterprise File Sync and Share can publish and version file updates in less than five minutes. In the event of a ransomware attack, customers can recover systems and workstations with CTERA Backup, then restore the folder versions stored in CTERA EFSS to return quickly and easily to the most recent file state.
So, yes – back up everything you can: your systems, servers, databases, etc. But by adding file sync into your ransomware protection strategy, you can minimize your business outage while saving hundreds of thousands of pounds in ransom.
The only way we can put an end to the further spread of ransomware is by building the right safeguards that eliminate enterprise vulnerability and end the need to pay cyber-criminals to access our data and our systems. Whether you choose CTERA tools or any number of other approaches to safeguarding your organisation, decide to be prepared and please don’t delay.