IT 233: Business Information Systems
By the end of this chapter, you will be able to:
Big Data refers to vast, complex datasets that are too large to be managed or analyzed using traditional data processing tools.
The challenge isn't just storage. It's about...
Click each scenario card to reveal whether it requires Big Data technologies or not.
Scenario A: A streaming service tracks 80 million users playing songs simultaneously and updates playlists in real time.
Scenario B: A local pharmacy records 150 prescriptions per day in a spreadsheet.
Scenario C: Smart traffic cameras across Kathmandu stream HD video 24/7 for city-wide congestion analysis.
Scenario D: A school exports quarterly exam results from its student information system.
Big Data is commonly defined by five key characteristics:
Volume: The scale of data
Velocity: The speed of data
Variety: The different forms of data
Veracity: The trustworthiness of data
Value: The business outcome from data
Read the scenario then select which V of Big Data it best illustrates.
Volume refers to the sheer quantity of data being generated and stored.
We've moved beyond Gigabytes (GB) and Terabytes (TB) to...
Example: Facebook stores hundreds of petabytes of user photos and videos. The Large Hadron Collider's detectors generate roughly 1 petabyte of raw collision data per second, most of which is filtered out before storage.
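To make the jump in scale concrete, the short Python sketch below converts between storage units and estimates how quickly a stream of about 1 PB per second would fill an ordinary drive. The 2 TB drive size and the decimal (powers of 1000) unit definitions are assumptions chosen for illustration, not figures from this chapter.

```python
# Decimal (SI) storage units; each one is 1000x the previous one.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]

def to_bytes(value, unit):
    """Convert a value expressed in `unit` to bytes."""
    return value * 1000 ** UNITS.index(unit)

def seconds_to_fill(capacity_bytes, rate_bytes_per_s):
    """Time needed to fill a store of the given capacity at a constant rate."""
    return capacity_bytes / rate_bytes_per_s

drive = to_bytes(2, "TB")    # hypothetical 2 TB laptop drive
stream = to_bytes(1, "PB")   # roughly 1 PB arriving every second
fill_time = seconds_to_fill(drive, stream)
print(f"A 2 TB drive would fill in about {fill_time * 1000:.0f} ms")
```

At that rate, storage alone cannot be the whole answer; data has to be filtered or summarized as it arrives.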
Velocity is the speed at which new data is created and must be processed.
Often, insights are needed in real time or near real time to be useful.
Example: Real-time stock market analysis, live social media trend monitoring, and monitoring of IoT sensors on a factory floor all require immediate processing.
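One way to picture "immediate processing" is a sliding window that looks only at the most recent readings as they arrive, rather than waiting to analyze a full day of stored data. The sketch below is a minimal illustration; the temperature values, the window of 3 readings, and the alert threshold of 80 are all made-up assumptions.

```python
from collections import deque

def rolling_alerts(readings, window=3, threshold=80.0):
    """Yield (index, average) whenever the mean of the last `window`
    readings exceeds `threshold`."""
    recent = deque(maxlen=window)  # older readings drop off automatically
    for i, value in enumerate(readings):
        recent.append(value)
        if len(recent) == window:
            avg = sum(recent) / window
            if avg > threshold:
                yield i, avg

# Hypothetical factory-floor temperature readings, in arrival order.
temps = [70.0, 72.0, 75.0, 85.0, 90.0, 88.0, 71.0]
alerts = list(rolling_alerts(temps))
for index, avg in alerts:
    print(f"reading #{index}: rolling average {avg:.1f} is above the threshold")
```

Because the function is a generator, each alert can be acted on the moment its reading arrives, which is the essence of stream processing.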
Variety refers to the different forms that data can take. Big Data is rarely neat and tidy.
Structured: Highly organized, like a spreadsheet or SQL database.
Unstructured: No predefined format, like text, images, or video.
Semi-structured: Has tags/markers, like XML or JSON files.
The vast majority of Big Data is unstructured.
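The difference between the three forms is easiest to see when the same kind of information is handled in each. The sketch below is illustrative only; the order record, customer name, and review text are invented sample data.

```python
import csv
import io
import json

# Structured: fixed columns, and every row has the same fields.
csv_text = "order_id,customer,amount\n101,Asha,250\n"
structured = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: fields are tagged, but nesting and optional keys are allowed.
json_text = '{"order_id": 101, "customer": {"name": "Asha"}, "tags": ["gift"]}'
semi_structured = json.loads(json_text)

# Unstructured: free text; no schema says where the name or the opinion is.
review = "Asha here. Order arrived late but the gift wrap was lovely!"
word_count = len(review.split())

print(structured[0]["amount"])               # look up a value by column name
print(semi_structured["customer"]["name"])   # look up a value by nested key
print(word_count)                            # only crude measures without further analysis
```

Structured and semi-structured data can be queried directly by field name, while the review needs extra processing (such as text analysis) before it yields anything useful.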
Click each item to cycle through data types, then press Check to score yourself.
Veracity refers to the trustworthiness, accuracy, and quality of the data.
With data from so many sources, uncertainty and "noise" are major challenges.
Example: Analyzing social media sentiment is difficult due to sarcasm, slang, and fake accounts. This affects the data's veracity and can lead to wrong conclusions.
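The sarcasm problem can be shown with a deliberately naive keyword-based sentiment scorer. This is a toy sketch, not how production sentiment tools work; the word lists and the two posts are invented for illustration.

```python
# Tiny, hypothetical sentiment word lists.
POSITIVE = {"great", "love", "amazing"}
NEGATIVE = {"bad", "terrible", "hate"}

def naive_sentiment(text):
    """Score = number of positive words minus number of negative words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

posts = [
    "I love this phone, the camera is amazing!",           # genuinely positive
    "Oh great, my flight is delayed again. Just great.",   # sarcastic complaint
]
scores = [naive_sentiment(p) for p in posts]
for post, score in zip(posts, scores):
    print(score, post)
```

Both posts receive the same positive score, even though the second is a complaint. Noise like this lowers veracity and can skew any conclusion drawn from the totals.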
Click each data source to label it, then press Check to see your score.
Value is arguably the most important V. Does the data lead to a tangible business outcome?
If you cannot turn your data into value, it's not an asset; it's a costly storage problem.
The goal is to derive insights that lead to:
Click the best technology for each Big Data use case, then Check.
Analyzes massive volumes of viewing data (what you watch, when you pause, what you search for) to power its recommendation engine and decide which new shows to produce.
The Nepal Tourism Board could analyze unstructured data from social media (Instagram geotags, travel blogs, TripAdvisor reviews) to identify emerging tourist destinations, understand visitor sentiment, and plan marketing campaigns more effectively.
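A first step toward spotting emerging destinations could be as simple as counting how often each place is mentioned in public posts. The sketch below assumes the posts have already been collected as plain text; the sample posts and the watch list of destinations are hypothetical, and a real analysis would also need to handle spelling variants, other languages, and sentiment.

```python
from collections import Counter

# Assumed watch list of destinations to track.
DESTINATIONS = ["Pokhara", "Mustang", "Rara Lake", "Chitwan"]

# Invented sample posts standing in for collected social media text.
posts = [
    "Sunrise over Rara Lake was unreal #nepal",
    "Three days in Pokhara and I never want to leave",
    "Rara Lake is so quiet compared to Pokhara",
    "Jeep safari in Chitwan, saw two rhinos!",
]

def count_mentions(posts, destinations):
    """Count how many posts mention each destination (case-insensitive)."""
    counts = Counter()
    for post in posts:
        text = post.lower()
        for place in destinations:
            if place.lower() in text:
                counts[place] += 1
    return counts

mentions = count_mentions(posts, DESTINATIONS)
print(mentions.most_common())
```

Tracking these counts over time, rather than as a single snapshot, is what would reveal a destination that is gaining attention.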
Let's discuss the chapter questions.