with both internal and external stakeholders. All types of business analysis – descriptive, diagnostic, predictive and prescriptive – ultimately require clean, accurate and high-quality data. But raw data is similar to crude oil: it’s messy, unstructured and unrefined. Before businesses can use it, they must have a system in place to process, analyze and store it. Otherwise, data is more of a liability than a strategic asset. This is where data integration and pipelining come into play.

Integration Overview

Companies today pull in vast amounts of data from various sources such as IoT sensors, websites, social media, spreadsheets and applications. The challenge is that these sources are often siloed. In addition, raw data has varying levels of quality and consistency. Before data can be used, it must be cleaned and transformed into a unified format. To accomplish this, companies need to move data from source locations to staging areas and into warehouses or lakes. This process is called data pipelining.

There are several types of data pipelines to consider. For example, companies may use batch pipelines to process data at intervals or streaming pipelines for real-time data flows. Various strategies exist for extracting, transforming and loading data into storage for easy access. Examples include extract, transform, load (ETL); extract, load, transform (ELT); and extract, transform, load, transform (ETLT). (A simplified sketch of a batch ETL job appears at the end of this article.)

By implementing modern data pipelines, companies can ensure they are feeding AI and machine learning algorithms and data visualization tools clean, accurate and reliable information – resulting in stronger outputs with fewer errors.

“Companies have realized that AI models are only as good as the data feeding them, yet most struggle with fragmented, inconsistent data trapped in silos,” Singh said. “This creates a massive opportunity for channel partners to solve the ‘first-mile’ data problem – helping businesses transform raw, messy data from customers and partners into structured, usable information.”

Ideally, every company should have comprehensive pipelines for moving data from ingestion to production. But for many companies, their environments look nothing like this. All too often, information flows into the business and winds up stagnating, which leads to missed opportunities.

What’s interesting is that businesses will go to great lengths to protect their data or recover it from ransomware attacks – especially mission-critical data. At the same time, companies often neglect data processing and integration, which leads to stagnation and poor data ROI. This limits a company’s ability to unlock value and drive growth.

In addition, companies often rely on bad data to make decisions. This can be especially risky when using AI systems. For example, Harvard Business School found that poor input management and flawed training data can negatively impact AI’s output. In a study, researchers analyzed work schedules for thousands of retail employees over a five-year span. They discovered that 7.8 million shifts (7.9 percent of the total) required manual modifications because of incorrect information provided to the AI model.

“If you put in garbage, the AI tool – no matter how sophisticated it is or how complex it is or how much data you feed it – will produce something that’s suboptimal,” explained Caleb Kwon, a doctoral student who led the study.
“And that’s exactly what we found: the schedules generated by this AI tool do not reflect the reality of what employees can and can’t do. The generated work schedules were effectively useless.”

The study indicates that companies typically experience the strongest benefits when they have “strong, principled controls in how AI tools are set up and managed before deployment, rather than treating them as autonomous solutions.”
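To make the batch ETL pattern mentioned above concrete, the following is a minimal, illustrative sketch in Python: it extracts raw records from a CSV source, transforms them by dropping incomplete rows and normalizing values, and loads the result into a local SQLite table standing in for a warehouse. The file name, column layout and database path are hypothetical assumptions for this example only and do not refer to any specific vendor tool.

```python
"""Minimal batch ETL sketch: extract raw CSV rows, clean them, load to SQLite.

The source file (sales_raw.csv), its columns and the warehouse.db path are
illustrative assumptions, not references to a real system.
"""
import csv
import sqlite3


def extract(path):
    """Extract: read raw rows from a CSV source, one dict per row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def transform(rows):
    """Transform: drop incomplete records and normalize fields."""
    cleaned = []
    for row in rows:
        if not row.get("order_id") or not row.get("amount"):
            continue  # skip rows missing required fields
        cleaned.append({
            "order_id": row["order_id"].strip(),
            "region": (row.get("region") or "unknown").strip().lower(),
            "amount": round(float(row["amount"]), 2),
        })
    return cleaned


def load(rows, db_path="warehouse.db"):
    """Load: write cleaned rows into a table (SQLite stands in for a warehouse)."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id TEXT PRIMARY KEY, region TEXT, amount REAL)"
    )
    con.executemany(
        "INSERT OR REPLACE INTO orders VALUES (:order_id, :region, :amount)",
        rows,
    )
    con.commit()
    con.close()


if __name__ == "__main__":
    # Run the three stages as a single batch job.
    load(transform(extract("sales_raw.csv")))
```

In a streaming pipeline the same three stages apply, but records are cleaned and loaded continuously as they arrive rather than in scheduled batches; in an ELT variation, the raw rows would be loaded first and transformed inside the warehouse.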