In earlier days we used to have three different teams: a configuration management team to build the product, a development team that develops and hacks on the software, and a test team that tests the system.
(In this SmartBits, Arun Krishnan outlines “Challenges in testing big-data applications”. The video is at the end of this blog.)
“I understand that big data is characterized by three V’s (volume, velocity and variety), with the data formats classified into structured, semi-structured and unstructured data, and that the data is acquired from a variety of sources. What are some of the challenges or issues that these pose to validation?” Dr Krishnan’s answer to this question is below.
The three V’s (volume, velocity and variety) actually depend on who you are. There are 4 V’s and 5 V’s as well. Some of the definitions of Big Data revolve around volume, variety and velocity, but in a true sense all of these are relative.
There are people who say 1 TB of data and above is Big Data. The best definition as of today is one bit more data than your system can handle. If a system has 8 GB of memory and there are 8 GB and 1 bit of data, then that data cannot be loaded into memory: that is Big Data, and it needs to be broken into chunks.
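To make the idea of breaking data into chunks concrete, here is a minimal Python sketch. It is only an illustration, not something from the talk: the helper names `chunked` and `mean_in_chunks` are hypothetical, and computing an average stands in for whatever check you actually need to run. The point is that only one chunk is held in memory at a time.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(values: Iterable[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield successive lists of at most chunk_size items from values."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(values)
    while True:
        # islice pulls the next chunk_size items without reading the rest.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def mean_in_chunks(values: Iterable[float], chunk_size: int) -> float:
    """Compute the mean while holding only one chunk in memory at a time."""
    total = 0.0
    count = 0
    for chunk in chunked(values, chunk_size):
        total += sum(chunk)
        count += len(chunk)
    if count == 0:
        raise ValueError("no data to average")
    return total / count
```

In practice `values` would be a generator that reads records from a file or a stream, so the full data set never has to fit in memory at once.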
When we talk about Big Data, we need to understand that it is relative. What counts as Big Data for a retail chain, where point-of-sale data comes in every second or every minute, need not be the same for a company that tests software, where the focus is on test results coming in every few minutes or every hour.
HR data is not big data from a retail chain’s perspective, but from an HR perspective it is: they have a variety of data sets coming in and they have to pull it all together. The trick is in bringing all the data together and then digging deeper into it. It’s not about the quantity of data.
Data come in different forms: structured, semi-structured or unstructured. How we tie it all together and how we gain insights from it is analytics.
Another example: one of my students did an internship at an Indian public sector unit, where he was asked which colleges were the best to hire from. There is a huge amount of data that one could gather for this. The student did something really simple and straightforward. For every college, he took the average performance score of its graduates and the average amount of time those graduates had spent in the organization. He plotted the data with performance on the y-axis and time spent on the x-axis. Then, arbitrarily, he drew two cut-off lines, one parallel to the x-axis and one parallel to the y-axis. The plot suddenly has four quadrants, and interestingly enough all the IITs came in the bottom-left quadrant, which is low retention and low performance. Very simple things can be done in analytics, but the idea here is tying these two pieces of data together.
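The quadrant exercise can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the college names, scores and cut-off values below are made up, and the function names `quadrant` and `classify_colleges` are hypothetical, not the student's actual code.

```python
from typing import Dict, Tuple


def quadrant(tenure: float, performance: float,
             tenure_cut: float, performance_cut: float) -> str:
    """Label a point by which side of each cut-off line it falls on."""
    retention = "high retention" if tenure >= tenure_cut else "low retention"
    perf = ("high performance" if performance >= performance_cut
            else "low performance")
    return f"{retention}, {perf}"


def classify_colleges(colleges: Dict[str, Tuple[float, float]],
                      tenure_cut: float,
                      performance_cut: float) -> Dict[str, str]:
    """Map each college to its quadrant.

    colleges maps a name to (average years spent, average performance score).
    """
    return {
        name: quadrant(tenure, performance, tenure_cut, performance_cut)
        for name, (tenure, performance) in colleges.items()
    }


# Made-up sample data, purely for illustration.
SAMPLE_COLLEGES = {
    "College A": (1.0, 2.0),
    "College B": (4.0, 4.5),
    "College C": (5.0, 2.5),
}

if __name__ == "__main__":
    # The cut-offs are arbitrary, just as in the story.
    for name, label in classify_colleges(SAMPLE_COLLEGES, 3.0, 3.5).items():
        print(f"{name}: {label}")
```

The two cut-off values play the role of the two lines drawn parallel to the axes. Moving them changes which quadrant a college lands in, which is why they should be treated as a judgment call rather than a fixed rule.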
Even for testing, it is important to figure out how we can tie together the real-time data coming in from devices, what we are getting from server logs, and what developers might be putting in as comments, and then use that to infer what the issue could be and build the test cases.
Analytics is a buzzword these days. But quite honestly, we have been doing analytics for a long time. If you think about it, an animal’s fight-or-flight response is analytics.
Skills and competencies are most important. One needs to understand that it is not just the business and it is not just the technology; it is both. One needs to acquire multiple skills.
There is a change in the perspective of quality and in the measurements/KPIs for QA teams that deliver productised solutions to large enterprise customers.
I think it is a given. You cannot validate if you do not know how it is constructed and what it is doing.
As a partner/solution provider, the first and foremost need is to have a partnership mindset with the organization’s IT team.
It is a holy grail of software development. We have seen from the CMM days that defect prevention has been the holy grail.
Whether the end-user organization is small or large, the challenges remain the same for both of them.
Raja Nagendra Kumar outlines the role of refactoring and unit testing in producing clean code. He states this very interestingly as “Technical debt is fat, clean code is liposuction” and crisply explains the act of producing clean code.