Using the Power of Apache Spark (PySpark) to Verify Loads of Messages
Imagine a scenario where you have a system that costs about £30K to run and maintain and is also a necessary component of your company’s audit trail, so it cannot be deprecated...
BigData
Elasticsearch: Bulk ingest data
Often you think of a solution to a simple problem, and once you have it you realise you need to apply it to a large dataset. In this post, I will explain how I deployed a simple solution to a larger dataset while preparing the system for future...
BigData at the Commandline
BigData and Agile did not seem to get along in the past, but that is no longer the case. One of the important points in processing data is data integrity. Assuming you are pulling data from an API (Application Programming Interface) and performing some processing on the...



