Systemic Bias in Algorithms as Well as Data

By David Magerman, PhD

Facebook recently announced a new program called Casual Conversations, which offers an open-source data set of over 40,000 unscripted videos from a diverse set of Facebook users. The announcement is significant because it acknowledges the reality that the vast majority of data sets available to AI researchers for facial recognition and other video-processing algorithms underrepresent women, people of color, and other minorities. The data set aims to right the wrong done to these groups by the introduction and use of AI systems that are far less accurate for people who are not white males and that draw less reliable conclusions about women and people of color.

This program is a welcome step in reversing the damage done to the reputation of AI systems, but unfortunately it addresses only one piece of the problem. Diversified training data is an important ingredient in ML-based AI systems, but it is only part of the puzzle. Training models on biased data is likely to lead to biased predictions and biased behaviors; however, training models on unbiased, diversified data may not produce results that are much less biased. The reason has to do with how ML-based algorithms are developed.

If machine learning were simply a matter of math, there would be no issue. Parts of the algorithm are just math. When you are inverting a matrix or maximizing or minimizing an algebraic function, it doesn't matter what the training data looks like. There are many ways of implementing those computations, but they will all give you the same answer.
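To make that point concrete, here is a minimal sketch in Python, with invented data used purely for illustration: two different implementations of the same least-squares fit agree to numerical precision, no matter what the data looks like.

```python
# Minimal sketch with made-up data: the purely mathematical steps of machine
# learning are deterministic, so any correct implementation of the same
# computation returns the same answer regardless of the data's composition.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                # hypothetical feature matrix
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Implementation 1: closed-form normal equations (explicit matrix inversion).
beta_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# Implementation 2: a library least-squares solver (SVD under the hood).
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(beta_normal, beta_lstsq))  # True: same answer either way
```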

However, machine learning is not just math. It is also, in fact mostly, heuristics. Most of the optimization problems solved in machine learning algorithms are intractable, and their solutions can only be approximated. Artificial intelligence systems combine different machine learning processes and models with implementations of (sometimes biased) real-world assumptions, knowledge, and reasoning. AI research involves iterating on initial training and test sets to evaluate different combinations of these components, with design decisions made along the way, to yield a software system that can then be applied to larger data sets to train the overall system to perform some real-world task.

Once that process is complete, the structure of the machine learning system is largely set in stone. Applying the system to new data sets may yield different results, but the biases encoded in the training and test sets used during ML and AI software development will carry through to all future uses of those systems.
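As a hypothetical illustration of how this happens (the feature names, threshold, and data below are invented, not drawn from any real system), consider a resume-screening pipeline in which a feature set and a decision cutoff were frozen while iterating on the original, biased development data. Retraining refits the model weights, but the frozen choices carry through unchanged.

```python
# Hypothetical sketch: the design decisions below were hard-coded during
# development on the original data. Retraining only refits the weights.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Design decisions frozen after iterating on the original (biased) dev set:
SELECTED_FEATURES = ["years_experience", "employment_gap_months", "school_rank"]
DECISION_THRESHOLD = 0.72   # cutoff tuned on the original test set

def train(applicants: pd.DataFrame) -> LogisticRegression:
    """Refits the model weights -- but not the design decisions above."""
    model = LogisticRegression(max_iter=1000)
    model.fit(applicants[SELECTED_FEATURES], applicants["hired"])
    return model

def screen(model: LogisticRegression, candidate: pd.DataFrame) -> np.ndarray:
    """Scores a candidate with the original feature set and cutoff,
    even if the weights were retrained on diversified data."""
    prob = model.predict_proba(candidate[SELECTED_FEATURES])[:, 1]
    return prob >= DECISION_THRESHOLD

# Synthetic stand-in for a new, diversified retraining set.
rng = np.random.default_rng(0)
new_data = pd.DataFrame({
    "years_experience": rng.integers(0, 20, size=200),
    "employment_gap_months": rng.integers(0, 36, size=200),
    "school_rank": rng.integers(1, 100, size=200),
    "hired": rng.integers(0, 2, size=200),
})
retrained = train(new_data)
print(screen(retrained, new_data.head(1)))
```

However many times train() is rerun on better data, screen() still funnels every candidate through feature choices and a threshold that were tuned on the original data set.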

A good example of this phenomenon in action is human resources technology. Over the past few years, software companies have developed AI-driven algorithms for a range of HR tech problems: triaging resumes, rating interview videos, evaluating candidate fit for teams, managing professional development, and so on. In the past year, many of these systems have been found to be biased in the ways discussed above, largely because of biases in the data they were trained on. The problem is that retraining these same algorithms on diversified data has not made them much less biased. The likely reason is that the choices made in the design of these software systems encoded the bias of the original data sets into the training algorithms and into the ways the trained models are deployed to make decisions. The only way to truly fix those systems and remove their performance biases is to wipe the slate clean and rebuild the algorithms from scratch. Simply updating model coefficients and weights with new data won't undo the systemic bias in the software itself.


In contrast, consider one of our portfolio companies, Knockri. Knockri's three co-founders experienced racism and bias in their own job searches and set out to build an AI-driven system for automating the interview process that was designed from the start to be unbiased. They began with diversified training data containing objective information that represented the universe of job-relevant behaviors, supplemented with colloquial behavioral statements drawn from a diverse set of applicants. Only once they had this diverse data set did they start making the heuristic design decisions that hard-code the characteristics of the initial training and test data into their algorithms. As a result, Knockri's human resources solution performs exceptionally well on evaluations that measure bias in real-world applications. Their solution shows how important it is to understand the way bias baked into training data sets can influence the training process.

Facebook's Casual Conversations data set is a huge step toward helping AI researchers build unbiased, fair software systems that solve real-world problems without disenfranchising large swaths of society. However, it isn't enough. AI researchers and engineers need to go back to first principles, wipe the slate clean, and build new algorithms without the intuitions gleaned from the past decade of machine learning and AI research. In fact, the best thing we can do is find brilliant engineers and scientists who don't know anything about how to do AI and ML on human behavioral data and throw them at the problem. Without the knowledge of the biased decisions their predecessors made, they can go where the unbiased data takes them and build much fairer algorithms.

Until the AI/ML community takes this radical step, we will be hamstrung by the design decisions we made on biased data, and we will continue to produce flawed software that reinforces the systemic racism that brought us to where we are now.

