Tuesday 15 September 2015

Data Myna: Digital Data and Analysis


Kathryn Kure of Data Myna

Data Myna will do whatever it takes to get or derive, then verify, clean, code, post-code, transform and analyse high-quality data you can trust to enable a decision-maker to make fact-based decisions.  


Kathryn Kure is a trained human sciences researcher and analyst or "Think" Marketer. However, from the get-go she has worked as an applied researcher and as a "Do" Marketer.

This combination of thought and action means Data Myna focuses on actionable results. 

For an up-to-date CV and recommendations, click on the LinkedIn profile of Kathryn Kure

Kathryn is an up-to-date member of the Southern African Marketing Research Association


How to contact Data Myna:

Email: kathryn@datamyna.com 
Phone: +27 83 2520992




Use the Data Myna for actionable results



Beware: Garbage In Means Garbage Out

This is the Age of Information: we have more data about more things – and people – than at any time in human history. And, with technology, we can access that data, which should make our marketing and our businesses better, because we know a lot more about our customers. Right?

Wrong. We are in danger of drowning in data because we cannot interpret it and use it to make good business decisions.

Let’s look at the Internet and other forms of digital marketing, for instance. Proponents of digital claim that the medium is the one which offers the best feedback, because the data is available, instantly. But precisely what data is available, how clean is it, and how much does it really tell you about your consumer? Big data is usually also "dirty data".

Data is very simple: you either have it or you don’t; it is either accurate or it is not.

When you don't have the data, or the data is highly inaccurate or "dirty", no statistical technique in the world can generate an analysis you can trust for decision-making.

We have a word for it: GIGO – Garbage In, Garbage Out.

Data is a profoundly human artefact – you can’t step out into a field and simply gather data as you would pick a bunch of grapes.

Instead, you have to determine what you want measured, how it should be measured and then put in procedures to measure it, prioritising, of course, the data that is most important for your business.

There is a surprising amount of inaccurate or “dirty” data out there, and companies must be sure they are making business decisions on accurate or cleaned data.

Dirty data is often caused by human error. Sometimes it’s a design fault of a system that does not undertake good enough data verification.

Some error is almost unavoidable: for instance, humans will make spelling mistakes – which is why Google provides a drop-down list surmising what correctly spelled word or words you are searching on as you type.

Other errors relate to the fact that humans will always cheat or take shortcuts when they can. Sales clerks, spurred by the desire for commission, rapidly learn they can get around the need for a client ID by entering in their own ID, over and over and over again, if the system enables duplication without verification. This generates dirty data that is almost impossible to clean. Sometimes, it is more efficient and effective to discard dirty data than even attempt to clean it … just because you have data doesn’t mean you should use it.

Bad data inevitably leads to bad decisions, while acknowledging you don’t have enough of the data you need goes a long way to fixing the problem.

Sometimes the data is accurate, but it doesn’t tell you what you think it does.

Take, for instance, something as simple as web traffic. It’s easy to measure, but what are you measuring? Traffic emanates from both humans and computers (bots or spiders), so you need a way of splitting this data into groups: human traffic this side, computer-generated traffic that side.

The behaviour of humans and bots differs (the latter is on the site for a fraction of time of the former, for instance), so it is possible to filter out the non-human “users”.

The technical term is "disaggregation": just as you segment consumers into groups of interest, so you must categorise your data into groups of interest.

Another issue with traffic is that of simple geography. If you are a local company, traffic from outside your country or region is generally not useful to you. Do make sure these numbers are separated out before you undertake any analyses.
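The two splits described above (human versus bot, local versus foreign) can be sketched in a few lines of Python. This is a minimal illustration only: the `Visit` record, the two-second threshold and the list of bot markers are all assumptions made for the example, not settings taken from any particular analytics package.

```python
from dataclasses import dataclass

# Hypothetical values, chosen only for illustration.
BOT_MARKERS = ("bot", "spider", "crawler")
MIN_HUMAN_SECONDS = 2.0


@dataclass
class Visit:
    user_agent: str
    seconds_on_site: float
    country: str


def is_probable_bot(visit: Visit) -> bool:
    """Flag visits that declare themselves as bots or leave almost at once."""
    agent = visit.user_agent.lower()
    declared = any(marker in agent for marker in BOT_MARKERS)
    return declared or visit.seconds_on_site < MIN_HUMAN_SECONDS


def disaggregate(visits, home_countries):
    """Split visits into bot, local-human and foreign-human groups."""
    groups = {"bot": [], "local_human": [], "foreign_human": []}
    for visit in visits:
        if is_probable_bot(visit):
            groups["bot"].append(visit)
        elif visit.country in home_countries:
            groups["local_human"].append(visit)
        else:
            groups["foreign_human"].append(visit)
    return groups


visits = [
    Visit("Mozilla/5.0", 48.0, "ZA"),
    Visit("Googlebot/2.1", 0.3, "US"),
    Visit("Mozilla/5.0", 95.0, "DE"),
]
groups = disaggregate(visits, home_countries={"ZA"})
for name, members in groups.items():
    print(name, len(members))
```

Only the "local_human" group would then feed into an analysis for a local business; the other groups are set aside rather than silently mixed in.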

A web analytics package, like the one offered by Google, provides a veritable cornucopia of data and analytics.

But to understand it fully, you have to work with the data, transforming it into something that is useful to your business and answers your business questions. You must use its tools to separate your data into the categories of interest.

If you don’t do this, then don’t expect the data to help you make good business decisions.

Web data has a number of limits in terms of consumer information.

One limit is the problem of online privacy. There are a number of online identification and tracking technologies available, from cookies to tracking pixels in emails, which help you gain insight into your potential or actual clients.

However, collecting some types of information passively without explicit, prior, informed consent violates consumer research ethics and may be illegal in many EU countries.

Both marketing and marketing research require consumer confidence, and the cost of client distress, should clients discover what they may consider “spyware”, is simply not worth any potential benefit.

In some parts of the world, even harvested Internet addresses are considered personally identifiable information.

Consumers are also becoming wiser and more savvy at protecting their data: there are a number of techniques to ensure companies cannot harvest data, from deleting browsing history regularly and blocking specific cookies while enabling others, to ensuring images are not automatically downloaded in emails to avoid tracking pixels.

This means that if a large enough number of consumers are avoiding being tracked, you’re not getting a representative sample showing up in your analytics. And you might lose out on information about the demographics and the interests of the people you wish to target, and be making decisions on unrepresentative data.
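A small worked example shows how uneven tracking skews what you see. The figures are hypothetical: suppose two customer segments are equally large in reality, but only 40% of segment A can be tracked against 90% of segment B.

```python
def observed_share(true_share: float, tracked_rate_a: float, tracked_rate_b: float) -> float:
    """Share of segment A seen in analytics when A and B are tracked at different rates."""
    seen_a = true_share * tracked_rate_a
    seen_b = (1.0 - true_share) * tracked_rate_b
    return seen_a / (seen_a + seen_b)


# Hypothetical figures: segment A is half of all real visitors,
# but only 40% of A is trackable versus 90% of segment B.
print(round(observed_share(0.5, 0.4, 0.9), 3))
```

Under these assumptions segment A appears as roughly 31% of tracked visitors although it makes up 50% of real visitors; if both segments were tracked at the same rate, the observed share would equal the true share.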

Despite the wealth of data and the apparent ease with which companies can get hold of it, the very nature of data requires that you interrogate it: large volumes of dirty data on the complex, multivariate creatures that are humans require a specialist skill-set to interpret.

Finally, you must also relate the data to the outcome of interest. Facts are stubborn; no amount of data manipulation can conjure up sales and customers, but it’s amazing how bedazzling other meaningless metrics can be.

Kathryn Kure undertakes independent, third-party analysis of web data and analytics. She is an independent member of SAMRA. www.datamyna.com
This article was first published on page 17 of the Saturday Star of 12 September 2015, and is republished with permission.






Thursday 11 June 2015

Brands should not miss the Google+ Bus


The Data Myna 

Why is everyone in SA ignoring Google+? It never seems to generate a blip on the radar of social media analysts, the Plus button is often absent from websites, yet it has, according to GlobalWebIndex, a significant number of active users in this country. And its new strategies enabling publishing to niche audiences could make it an even more effective platform than Facebook for brand-building, provided you get your content strategies right.

Monday 1 June 2015

Privacy, Dark Social & ROI



Research & Insights – by Kathryn Kure


Lars Basche, EMEA Digital Lead at Text100 Global Communications, spoke to Kathryn Kure of Data Myna about privacy and dark social media, consumer rights, branding and many other items relating to social media marketing and analytics.

The first question asked of Lars was how companies actually calculate their ROI on social media, given consumers’ increased need for privacy and the behavioural changes this gives rise to, such as the use of Dark Social (online chat and old-fashioned email, which are less easy to track) and regularly deleting their search history, amongst other strategies.

Lars responded that this is definitely an issue. Certainly in Germany, where he lives and works, you are legally required to get the OK from your customer before you include social data in your CRM; only then can you use their data. Although it is very easy to get loads of data about your customer, you need permission to use that data.

Furthermore, from a regulation point of view, if people are not granted this ability to control what companies gather, they will be increasingly hesitant to talk about themselves online.

If You are Not Paying for the App, You’re the Product
Lars was then asked how this applies to smartphone apps in particular, since either you have the app, and grant it the permissions it decides upon, or you don’t. There is currently no middle ground and, once you click on “Accept”, apps typically go on a major content grab and then initiate other services, such as ‘Fine Location’ tracking (which tracks you via cell-phone towers even if you’ve switched off your GPS).

Lars believes that, in the future, these user settings will change and become far more fine-grained from a user perspective, with users agreeing, or not, to enable each option item by item. In particular, governments are increasingly requiring app providers to change their way of operating. In general, however, while public awareness of privacy issues is growing, particularly off the back of Snowden’s revelations, the fact remains that if you are not paying for the app, then you are the product; how companies negotiate permissions with customers is going to be fundamental to their success, going forward.
So the issue at the heart of it all is simply that those who make the apps we all download for free and then use need, somehow, to make money. As with traditional advertising, they defray their operating costs by using the information gained to sell adverts to consumers. Hence you do pay, but indirectly, and consumers are increasingly weighing up the costs and benefits involved.

What Do ‘Free Apps’ Mean for Competition?

One of the issues this raises, though, is that the little companies are increasingly unable to compete against the established behemoths. This leads to monopoly, winner-take-all situations, in which the smaller companies, in order to compete, attempt to grab more and more content in what one German commentator has referred to as “cowboy capitalism”. So, while start-ups may figure it’s better to ask forgiveness than permission as they go about ‘breaking things fast’, this then leads to a consumer backlash as disclosures are made about the information consumers are unwittingly giving out, given that the idea of consent is predicated upon that of being fully informed.

So the question Lars was then asked was whether user behaviour is changing and ‘dark social media’ (i.e., the use of old-fashioned email, for instance) is growing and, if so, what the implications are for analysts. In other words:

Is What You ‘See’ What You Get?

In the early days of personal computing, the acronym WYSIWYG (what you see is what you get) was coined to indicate the transparency of certain computer products, where what you saw on the screen is pretty much how it printed out.  The question to analysts, however, from a sampling point of view is that, while you always have this cornucopia of data from certain individuals, are these individuals actually representative of the population at large? From an analytic point of view, is what you ‘see’ and therefore ‘get’ actually all there is to see?

That is, is the online population whose behaviour you can track easily different from that population which uses private groups (such as those set up in Google Plus) and dark social to communicate and share web-links and other information?

Lars felt that that depends on the relative size of the different groups and, more importantly, if these groups differ markedly from one another. There are commentators who believe that people increasingly do not worry about online privacy. Or, as others would put it, “If you want online privacy, don’t go online”, but this certainly is something companies absolutely do need to bear in mind when analysing their data.

Private Online Spaces

Lars went on to say that, since public awareness of the need for online privacy is growing, and people’s behaviour is changing as a result, moving towards promoting private spaces or engaging directly in private communities may become a strategic necessity for companies.

Fragmenting the Social Media Market

Effectively, though, we are looking at the concept of market fragmentation, and there are two issues here. The first is that, on sites which enable private communities, such as Google Plus, privately posted content is difficult to track at scale (unless a third-party app is granted explicit permissions, which not all consumers may grant). The second is that your younger users in particular may engage in behaviours in which they ‘hide in plain sight’ through use of in-group argot or jargon, thereby disguising their true intentions, as danah boyd and Alice Marwick point out in their 2011 paper, Social Privacy in Networked Publics: Teens' Attitudes, Practices, and Strategies.

It is fascinating to note how teens “develop intricate strategies to achieve privacy goals”. The authors quote Nancy Fraser, who notes that repressed groups often create “‘subaltern counterpublics’ which, from a civic engagement perspective, can be understood as ‘parallel discursive arenas where members of subordinated social groups invent and circulate counterdiscourses to formulate oppositional interpretations of their identities, interests, and needs’”. To give concrete examples of how teens are evolving strategies to enable some kind of privacy in public networks, they have been documented using strategies such as:
·       segmenting friend groups by service (which is not that easy if there is a social expectation that you 'must' be on a particular service);
·       deactivating the account daily - which means you can send messages or leave content only when logged in;
·       always deleting others' comments on your page after you have read them and deleting your comments on others' pages the day after posting them there;
·       or engaging in what boyd & Marwick call “social steganography” since “steganography is an age-old tactic of hiding information in plain sight, driven by the notion of ‘security through obscurity.’ Steganographic messages are sent through channels where no one is even aware that a message is hidden”. For instance, referencing the Monty Python song, ‘Always Look on the Bright Side of Life’ knowing full well your mum will think you’re cheerful but your friends will pick up you are, in fact, depressed.

Which Social Media Platform Do You Use?

Lars acknowledged the difficulty of the fragmented social media landscape. Currently, Facebook has no real competitor from a user point of view, which has to do with family and friends; however, other platforms offer interesting opportunities, though you may need to craft different messages for different groups of people. Facebook is evolving into a paid media platform, and with mobile ads its revenue has increased a lot, but on other sites, such as Twitter, LinkedIn and G+, you are not getting typical marketing activities occurring; they provide more of a pull than a push medium. Twitter enables people to be active and communicate in real time; Instagram has very heavy users and younger users, as a demographic; Google Plus, on the other hand, has a strong focus on good content and interaction, and it is useful to use G+ from an SEO point of view in that your posts there are more visible than those made on other platforms.

However, regardless of social media platform, the main issue Lars felt needed to be reiterated (again and again and again) is simply this:

Content Remains King (and Queen, and Jack of All Trades ...)

Lars felt you cannot emphasise enough how content is fundamental to your social media marketing efforts. So, although you can currently still buy followers on Twitter and Facebook, you soon enough realise these are fake accounts, and, in the future, such manipulation through purchased followers will become much harder.

But, above all, in social media, there is a huge issue with ‘signal vs noise’. So, how do you control the noise, given a focus on pushing information out? Is it better to have fewer, stronger signals over and above plenty of clamour and noise that simply adds to the endless chatter?

Lars responded that success is often gauged by the fact that, at an event, for instance, you are mentioned more than the competition, but that, if there is a lot of noise, this doesn’t mean a lot. Instead, brands should rather think of market enablement: they must focus on their own networks, not just create a brand platform on Facebook alongside hundreds of thousands of other brands, and evolve this into being able to talk to a targeted and select group of people.

However, the question always remains as to how to access them and how to have a conversation with them.

Lars responded that, particularly when it comes to social media for sales, you can help the sales department to add value by helping them to use social media for information, to create content, to reach out to people and to identify people who are influential: the influencers and analysts out there. So the change we are seeing is the focus on how to reach people with whatever is relevant. Hence, enabling sales departments can also help them with lead generation, provided these are quality leads and you can link people to their address and measure success in the real context of the sales person.
For instance, who are the top 10 influencers in the field? Are they amplifying your message? So, instead of sending out 10 000 messages, rather check if they are relevant and use influencers to amplify these messages.

Lars advocates the use of various tools to identify the influencers: desk-top research as well as social media monitoring tools such as Radian6.

Integrated Communications

More than anything else, however, Lars felt there is a strong necessity for integrated communications which means you need to integrate the different communication departments into one hub, so whether your personnel are in sales or HR or external communications – they are all, in fact, PR for the company. In an ideal world all departments would function in an integrated way and also be reaching out to customers. There will always be customers who have problems, products or services that don’t work as promised; it is important to find unhappy customers and react quickly to their discontent.

In other words, tone is really important to keep consistent across all channels; regardless of whether it is earned, paid or owned. It is important to integrate your advertising into it, use social media to engage with influencers, and, given how your social media activities are increasingly about customer acquisition as well as retention, from a strategic thinking point of view, you have to be more than simply just active, but you actually have to know what to do. So, for any content you put out, you need to know, is it relevant, is it strategic? The biggest gap in social media communications is often the strategy – are you doing the same as you always have done, or are you using your digital communications strategically?

No Excuse for Avoiding Strategy

Lars felt that it was fundamentally important to emphasise that, for companies, they need to define how to communicate and what to communicate, and what the goals are – which goals must be more than simply gaining more Facebook likes; if you as a company don’t understand your brand and what it stands for, you’re doomed to failure – and not just in the social media space.

As Michael Porter put it in the aptly named article, There's No Excuse for Avoiding Strategy:
"If you don’t clarify choices and their implications, then, as in a Rorschach inkblot test, employees (and especially salespeople and other customer-facing groups) will fill in the blanks with their own constructions. The result is diverse behaviors that fragment your resources and increase the risk of becoming a global mediocrity: a firm that’s good at many things but not great at any particular things, and that’s the surest way to dissipate any competitive advantage." 



LARS BASCHE – EMEA DIGITAL CONSULTANT
Lars is Digital Lead EMEA for Text100, based in Germany. As Digital Consultant, Lars assists clients entering and dealing with both the technologies and the communication challenges of the digital and social media world. In his current role, Lars leads an EMEA team of digital experts as well as a wider team of local digital leads in each of the 9 EMEA offices, and he is responsible for the development of the Text100 digital services in EMEA.
LinkedIn: Lars Basche
Twitter: @larsbas

Kathryn Kure of Data Myna focuses on actionable insights relating to digital data, with a specific focus on web-site and social media analytics.

About Data Myna

Over and above the obvious riff on 'data mining',  the Indian Myna is highly articulate, curious, adaptable, innovative and successfully out-competes other birds in its niche, which Data Myna argues is precisely what you want from your marketing intelligence.

Kathryn Kure - Data Myna 


Website: www.datamyna.com 

Thursday 7 May 2015

Tracking emails help sell, but privacy's lost


Kathryn Kure of Data Myna
In the marketing dust churned up by the disruptions of digital, particularly social media, some basic, and highly effective communication and sales channels have been ignored. 

The humble email, usually considered a customer retention exercise, is up to forty times more effective than both Twitter and Facebook combined when it comes to customer acquisition, according to data from the 2012 McKinsey iConsumer survey. McKinsey noted, however, that many companies were scaling back their email marketing in favour of social media; while the data told one story, the focus was on another, fed by the fear of missing out. But as Pew Research Centre stated in 2014:

"Despite a generation of threats and competitors, email ranks as the most important digital tool for workers who use the internet. Only 4% of these networked workers cite social media as very important on the job."

Part of the renewed interest in email comes from research initiated by Alexis C. Madrigal, which noted that a large amount of web traffic comes from sources such as email, instant messages and forum posts. This traffic was recently quantified by RadiumOne as representing over two-thirds of sharing activity. Hence there is a renewed interest in such Dark Social activities, and email in particular.

Madrigal coined the term Dark Social to refer to the fact that web analytics were generally not tracking or not able to track such activities. However, companies, increasingly aware of the importance of email in the digital marketing mix, are now using tools provided by companies such as Yesware, Bananatag, and Streak - which has over 300 000 users - to track emails to consumers.

In fact, the concept is really simple. A lot of emails being sent nowadays contain pictures that are not embedded in the email, but point to a URL, i.e., a web hyperlink, as the source for the content. Since pictures are seen as mostly harmless content from a security perspective, most end users have their email clients set to automatically download external pictures, so that email signatures with company logos, for instance, display correctly. To download these pictures in the email, a GET request is sent to the server, asking it for the picture file to display in the email. Within this GET request, additional information can be embedded by the originator of the email, so when the server sees the GET request from the client, it can extract this additional information, store it in a server-side database, and send the picture to the client (email viewer). Apart from this embedded information, the GET request will, by default, contain information on what application and/or operating system requested the picture, so the server can format it appropriately, and the IP address of the requestor, so the server knows where to send the response.

Together these bits of information can be used to track when, where (based on the IP address and global IP lookup databases) and on what device the email was opened, and also how many times it was opened. Furthermore, if you set a unique ID per customer, this can be sent back, collated and used in your web analytics, by linking the email address to which the original mail was sent to the unique ID.
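The mechanism described above can be sketched with Python's standard `urllib.parse` module. This is an illustration under stated assumptions, not the method used by any of the tools named in this article: the tracker address, the `cid` and `campaign` parameter names and the `record_open` helper are all hypothetical.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical tracker address; a real service would use its own domain.
PIXEL_BASE_URL = "https://tracker.example.com/pixel.gif"


def build_pixel_url(customer_id: str, campaign: str) -> str:
    """Sender side: embed a per-customer ID in the picture URL."""
    query = urlencode({"cid": customer_id, "campaign": campaign})
    return f"{PIXEL_BASE_URL}?{query}"


def record_open(request_url: str, client_ip: str, user_agent: str) -> dict:
    """Server side: pull the embedded ID out of the incoming GET request."""
    params = parse_qs(urlparse(request_url).query)
    return {
        "customer_id": params.get("cid", [None])[0],
        "campaign": params.get("campaign", [None])[0],
        "ip": client_ip,            # supplied by the connection itself
        "user_agent": user_agent,   # supplied by the email client's request
    }


url = build_pixel_url("c-42", "spring sale")
print(url)
print(record_open(url, "203.0.113.7", "ExampleMail/1.0"))
```

The email itself would simply reference this URL as the source of an image; each time the image is fetched, the server can log one "open" against that customer ID.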

The actual image that is embedded is called a tracking pixel: simply code that asks for and returns a transparent 1x1 image, normally a GIF file, to keep the size very small. The call and return are not visible to the email recipient, in that the image is literally invisible.
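To give a sense of how small such an image is, the sketch below decodes one widely circulated minimal transparent GIF and reads its dimensions from the file header. The base64 string is an example payload chosen for illustration; any given tracking service may serve a different file.

```python
import base64
import struct

# A widely circulated minimal transparent 1x1 GIF, base64-encoded.
PIXEL_B64 = "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
PIXEL_BYTES = base64.b64decode(PIXEL_B64)


def gif_dimensions(data: bytes) -> tuple:
    """Read width and height from the GIF logical screen descriptor."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    # Width and height are two little-endian 16-bit integers after the signature.
    return struct.unpack("<HH", data[6:10])


print(len(PIXEL_BYTES), "bytes, dimensions:", gif_dimensions(PIXEL_BYTES))
```

A server answering the tracking request would return these few dozen bytes with an image content type, which is why the exchange goes unnoticed by the recipient.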

That is, it acts just like a "read receipt" except that you may not be aware that you are actually sending back such a receipt. Just as people switched off automatic read receipts, so you can install software such as Ugly Mail, which tells you when your emails are being tracked, or Pixel Block, which blocks tracking pixels altogether but might block other useful pictures in the email as well. Of course, an even simpler way of not sending back information is simply never to download images, which means the code cannot be activated. In fact, this is generally advocated given that, while some uses may be benign, tracking can also be used for more malicious purposes, such as letting a would-be burglar know when someone is not at home.

For a marketer, such detailed information is powerful. However, even though email messaging should be opt-in, the fact that the pixel tracking is invisible to recipients cuts to the core of the issue of informed consent, and it could be argued that consumers should be given a chance to opt out of such tracking. While tracking is technically easy, many email services advocate that you do not automatically download images, so recipients who refuse images but do read the email are not counted as having read it. This means your web analytics have to be interpreted carefully: while you can say those who sent the GET request have opened the email, you can’t say that those who did not send it did not read the email.

There is a fine balancing act between a company’s need for data and a consumer’s desire for privacy and, coupled with that, security. Companies in the US have reported a sharp increase in cyber-attacks, both in breadth and sophistication, and as cyber-security issues become more prominent, so will companies increasingly deactivate permissions by default and it will again become less and less easy to track consumers, particularly by location, unless there is explicit, informed consent.

A shortened version of this article first appeared on page 19 of the Saturday Star, 2 May 2015; this has been republished with permission.
Saturday Star, 2 May 2015, page 19



Thursday 8 January 2015

Dark Social: Dimming the Lights on Web Traffic

Kathryn Kure of Data Myna
We are particularly good at lying, cheating and deceiving – even to ourselves. One of the most singular results from brain research indicates that, in the absence of other information, we tend to fill in the blanks by concocting coherent stories that we can believe. In split-brain studies, for instance, when a man's left hemisphere was shown a chicken claw and his right a snow scene, he correctly chose a chicken and a shovel as the matching pictures, but then generated a story as to why he chose the shovel, claiming it was to clean out the chicken shed.

Copyright and Her Limits Go to the Creative Commons

Copyright and Her Limits Go to the Creative Commons A Play in Two Parts   by Kathryn Kure     This work is licensed under Attribution 4.0 I...