As we stand on the threshold of the transformative application of AI across industry and society, this article illustrates how data is already the lifeblood of a wide variety of key verticals.
This is an extract from our Legal Aspects of Data white paper; the full document can be downloaded here.
The financial sector is one of the largest users of IT globally. Trading platforms – computer systems to buy and sell securities, derivatives and other financial instruments – are its beating heart. Based on an ecosystem of exchanges, index providers, data vendors and data users (asset managers on the buy-side, and banks and brokers on the sell-side), these platforms generate market data, indexes, reference data and analytics, and together form the world’s financial market data/analysis industry. Increasing regulatory requirements, technology developments (including the growing ability of AI to predict and interpret data) and market volatility have in recent years fuelled increasing demand for financial market data, with global spend reaching $37.3bn in 2022.[1]
In legal terms this complex ecosystem is held in place by contract, with market practice based on agreement structures that license, restrict and allocate risk around data use. These contracts have grown up over the years and constitute a stable, cohesive, normative framework in markets that have seen little litigation. Exchanges and data vendors will seek to apply their standard terms, which are almost universally based on (i) the reservation to the data provider of all IP (copyright, confidentiality, trade secrets and (in the UK and EU) database right) in the data being supplied; and (ii) a limited licence to the customer to use the data for specified purposes. Points of contention in exchange, index and data vendor agreements typically centre on:
- scope of licence and redistribution rights (internal use only or onward supply, and increasingly data use for AI, machine learning (‘ML’), service improvement and data science purposes);
- treatment of data derived from the data initially supplied (who owns it; what the user may do with it);
- use of the data after termination of the agreement; and
- scope of compliance audits and remedies for unpermissioned use and over-deployment.
[1] TP ICAP, 18 April 2023, ‘Global spend on financial market data totals a record $37.3 billion in 2022, rising 4.7% on demand for research, pricing, reference and portfolio management data – new Burton-Taylor report’.
In 2018, two important data-related developments took place in the UK banking industry. First, the UK implemented the second EU Payment Services Directive (‘PSD2’), enabling (among other things) banks and other payment account providers, their customers and third parties to share data securely with each other.[1] Second, in a sort of ‘own brand’ version of PSD2, the UK went live with its own Open Banking initiative, representing an important endorsement of Open Data principles. This mandated the nine largest UK banks to allow their personal and small business customers to share their account data securely and directly with third party providers regulated by the Financial Conduct Authority (‘FCA’) and enrolled in the Open Banking initiative. The Open Banking Ecosystem refers to all the components of Open Banking, including the Application Programming Interface (‘API’) standard and the security, processes and procedures, systems and governance to support participants in the initiative. As of spring 2023, there were 336 regulated providers and 7 million users in the Open Banking ecosystem.
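By way of illustration, the short sketch below shows how a regulated third-party provider might read a customer’s account list over an Open Banking-style API once consent has been given. The base URL, token handling and response field names are assumptions made for this example; the actual endpoints, security profile and consent flows are defined by the Open Banking API standard itself.

```python
# Illustrative sketch only: the base URL, token and response field names are
# assumptions for this example, not the canonical Open Banking specification.
import requests  # third-party HTTP client (pip install requests)

API_BASE = "https://api.examplebank.co.uk/open-banking/aisp"  # hypothetical provider endpoint
ACCESS_TOKEN = "<token obtained through the customer's consent and authorisation flow>"

def fetch_accounts() -> dict:
    """Read the list of accounts the customer has consented to share."""
    response = requests.get(
        f"{API_BASE}/accounts",
        headers={
            "Authorization": f"Bearer {ACCESS_TOKEN}",
            "Accept": "application/json",
        },
        timeout=10,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    # Field names below ("Data", "Account", etc.) are assumed for illustration.
    for account in fetch_accounts().get("Data", {}).get("Account", []):
        print(account.get("AccountId"), account.get("Currency"))
```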
The banking sector is consequently moving towards an increasingly standardised approach to IT around the structure and design of information architecture (‘IA’) in the shared trading, software, online and other information environments that characterise the banking world. For example, two industry standards bodies, The Open Group (whose TOGAF standard provides an open, enterprise-wide architecture framework) and the Banking Industry Architecture Network (BIAN) (which maintains a banking-specific standard IA based on service-oriented architecture), have worked together to facilitate the development of standardised IA and accelerate the transformation that is under way in the sector.
Central to any IA, and so to the collaboration between BIAN and The Open Group, is data modelling: the analysis and design of the data in the information systems concerned. The IA’s formal structure and organisation of the database:
- starts with the flow of information in the “real world” (for example, orders for products placed by a customer on a supplier);
- takes the information through levels of increasing abstraction; and
- maps the information to a data model (a representation of that data and its flow categorised as entities, attributes and interrelationships). It does this in a way that all information systems conforming to the IA concerned can recognise and process.
Although this example is taken from the banking world, the underlying method and analysis of IA and data modelling apply generally across industry sectors and are central to solving the technical challenges of big data management projects.
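The sketch below is a minimal, generic illustration of that entity–attribute–relationship approach, using the customer order example above; the entity and field names are illustrative only and are not drawn from BIAN or TOGAF.

```python
# A toy data model for the 'orders placed by a customer on a supplier' flow above.
# Entities become classes, attributes become fields, and interrelationships become
# references between entities. All names are illustrative only.
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Customer:                    # entity
    customer_id: str               # attribute (identifier)
    name: str                      # attribute

@dataclass
class Product:                     # entity
    sku: str
    description: str
    unit_price: float

@dataclass
class OrderLine:                   # relationship between an order and a product
    product: Product
    quantity: int

@dataclass
class Order:                       # entity
    order_id: str
    customer: Customer             # relationship: each order belongs to one customer
    placed_on: date
    lines: List[OrderLine] = field(default_factory=list)  # one order, many lines

# Any system conforming to the same model can recognise and process this order,
# because its entities, attributes and interrelationships are agreed in advance.
order = Order(
    order_id="O100",
    customer=Customer("C001", "Alice"),
    placed_on=date(2023, 6, 1),
    lines=[OrderLine(Product("P1", "Widget", 9.99), 3)],
)
```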
[1] Directive (EU) 2015/2366 of 25 November 2015 on payment services in the internal market (and amending previous directives) – https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32015L2366&from=EN; PSD2 was implemented in the UK by the Payment Services Regulations 2017 (UK SI 2017/752) – http://www.legislation.gov.uk/uksi/2017/752/contents/made.
Insurance is based on the insured transferring the risk of a particular loss to the insurer by paying a premium in return for the insurer’s commitment to pay out if the loss occurs. The combination of large datasets, foundation models, and generative and other types of AI enables insurance risk to be assessed and predicted much more precisely than in the past by reference to specific data about the insured, the risk insured and other indicators. In turn, these factors enable the price of the policy to be calculated more accurately and claims assessed more quickly.
As well as the traditional ‘top down’ statistical and actuarial techniques of risk calibration and pricing, insurers can now rely on data relating to the insured person and insights delivered by AI. For example:
- in vehicle insurance, location-based data from the driver’s mobile can show where they were at the time of the accident, and other telematics data from on-board IT can show how safely they were driving;
- in home insurance, smart domestic sensors help improve responsiveness to the risk of fire, flooding or theft at home; and
- in health and life insurance, health apps and wearables provide relevant information.
Comparing this specific data with insights gleaned from foundation models and other AI/ML systems trained on vast datasets enables further accuracy in calibration.
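As a deliberately simplified sketch of how that calibration might combine the two sources, the example below blends a traditional actuarial base rate with a telematics-derived risk score; the formula, weighting and score scale are assumptions made for illustration, not any insurer’s actual pricing model.

```python
# Toy premium calculation: a traditional actuarial base rate adjusted by a
# telematics-derived risk score. Formula, weights and score scale are
# illustrative assumptions only.

def blended_premium(base_rate: float, telematics_score: float, weight: float = 0.3) -> float:
    """
    base_rate        -- annual premium from 'top down' actuarial tables (GBP)
    telematics_score -- 0.0 (safest observed driving) to 1.0 (riskiest), e.g. derived
                        from speed, braking and location data reported by on-board IT
    weight           -- how much the individual's data moves the price (assumed 30%)
    """
    # Scale the premium between (1 - weight) and (1 + weight) of the base rate.
    multiplier = 1.0 + weight * (2.0 * telematics_score - 1.0)
    return round(base_rate * multiplier, 2)

print(blended_premium(500.0, 0.2))  # careful driver pays less than the base rate: 410.0
print(blended_premium(500.0, 0.9))  # riskier driving profile pays more: 620.0
```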
AI and analytics in insurance also highlight a number of common themes. First, there is the tension between the privacy of the insured’s personal data and its availability to others – a tension that insurers wrestle with in the context of genetic predisposition to illness and the socialisation of risk. Second, bias in AI models may lead to discrimination in outcomes. Third, as in the banking sector, increasing regulatory scrutiny is accentuating the importance of data analytics.
The air travel industry (‘ATI’) has grown up with computerisation, data and standardisation as key components in getting passengers and their baggage to the airport of departure, on to the plane, and to and from the airport of arrival.
Airlines manage their inventory (seats) and sell seat tickets to passengers:
- directly via their own websites;
- indirectly via ‘two step’ distribution – through global distribution systems (‘GDSs’) operated by the likes of Amadeus, Sabre and Travelport and then through travel agents; and
- indirectly via ‘one step’ distribution – through travel agents, either bypassing the GDS altogether or passing only tangentially through it.
Airlines pay the GDS a booking fee for their seats booked through that GDS. The GDS also pays commission to travel agents, who frequently engage with a single GDS provider in return for higher commissions. This also means that airlines in practice may need to present their content on all GDSs so as to ensure full coverage.
As large IT systems, GDSs are central to airline distribution in the way they aggregate data (‘content’) about airlines’ inventory from each airline’s passenger service system and present it to travel agents searching for seats for passengers.
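The sketch below illustrates that aggregation step in miniature: inventory feeds from several airlines are merged into a single view that an agent can search by route. The airline codes, fares and data shapes are invented for the example and do not reflect any GDS’s actual data model.

```python
# Toy aggregation of airline 'content' into a single searchable view,
# loosely mimicking what a GDS does for travel agents. All data is invented.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Seat:
    airline: str        # operating carrier code (illustrative)
    flight: str
    origin: str
    destination: str
    fare_gbp: float
    seats_left: int

def aggregate(feeds: Dict[str, List[Seat]]) -> List[Seat]:
    """Merge per-airline inventory feeds into one list, as a GDS presents to agents."""
    merged: List[Seat] = []
    for airline_feed in feeds.values():
        merged.extend(airline_feed)
    return merged

def search(content: List[Seat], origin: str, destination: str) -> List[Seat]:
    """Return available seats on a route, cheapest first -- the agent's search view."""
    matches = [s for s in content
               if s.origin == origin and s.destination == destination and s.seats_left > 0]
    return sorted(matches, key=lambda s: s.fare_gbp)

feeds = {
    "XA": [Seat("XA", "XA101", "LHR", "JFK", 420.0, 5)],
    "XB": [Seat("XB", "XB220", "LHR", "JFK", 395.0, 2)],
}
for seat in search(aggregate(feeds), "LHR", "JFK"):
    print(seat.airline, seat.flight, seat.fare_gbp)
```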
The recorded music industry is a $26bn global business in full digital transformation as streaming has come to dominate music consumption, accounting for over 80% of the industry’s revenues.[1] The structure of the industry has grown up around norms based on the individual and collective management and licensing of the various and distinct copyrights that arise in a song’s composition, lyrics and publication, and in its recording and performance. These copyright norms operate primarily on a national basis, with harmonisation and equivalence established internationally through copyright treaties like the Berne Convention and the WIPO treaties.
The big three record companies (Universal, Sony and Warner) together account for around 70% of the global recorded music market. The track is the product unit for the sector, and PPL, the UK Collective Management Organisation (‘CMO’) for the public performance rights of its 150,000 recording and performer members, operates a repertoire database of more than 20 million tracks that is growing by 50,000 new recordings per week. The digital representation of the track is the ‘line of rights data’, and more than 200 million lines of rights data are provided to PPL each year. Management of data is a large part of PPL’s work, driving more accurate distributions, business services to other CMOs and better international collections.
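For illustration only, a line of rights data might be thought of as a record linking a recording to the parties entitled to share in its performance revenue, as in the sketch below; the field names, shares and distribution logic are invented and are not PPL’s actual data format or processes.

```python
# Invented illustration of a 'line of rights data': a recording linked to the parties
# entitled to share in its public performance revenue. Not PPL's actual format.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class RightsLine:
    isrc: str               # International Standard Recording Code identifying the track
    recording_title: str
    rightsholder: str       # record company or performer (illustrative)
    role: str               # e.g. "recording rightsholder" or "performer"
    share_pct: float        # assumed percentage share of performance revenue

def distribute(revenue_gbp: float, lines: List[RightsLine]) -> Dict[str, float]:
    """Split revenue collected for one track across its rights lines, pro rata to share."""
    shares: Dict[str, float] = {}
    for line in lines:
        shares[line.rightsholder] = round(
            shares.get(line.rightsholder, 0.0) + revenue_gbp * line.share_pct / 100.0, 2
        )
    return shares

lines = [
    RightsLine("GB-XXX-23-00001", "Example Song", "Example Records", "recording rightsholder", 50.0),
    RightsLine("GB-XXX-23-00001", "Example Song", "Example Performer", "performer", 50.0),
]
print(distribute(1000.0, lines))  # {'Example Records': 500.0, 'Example Performer': 500.0}
```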
The record industry is another sector where AI and data techniques are enabling new music content to be generated as well as rapid insights into consumer preferences. These insights have historically been the province of record company A&R (artists and repertoire) teams but data is increasingly influencing musical taste, fashion, trends and therefore the creation of music itself in a way that has not been possible before.
[1] Statistics in this section include those from the IFPI Global Music Report 2023 and the PPL Annual Report.
Healthcare remains the sector where data use will have the greatest impact on people’s daily lives. Four drivers lie behind UK healthcare data innovation:
- intensifying cost pressures leading to demands for better data;
- increasing availability of national collections of clinical and treatment outcome datasets;
- growing investment in anonymising, aggregating and analysing data from individual care centres; and
- government support of AI, open data and interoperability standards.
Day-to-day public spending on healthcare in the UK (principally the NHS) is forecast to be around £175bn for the 2023/24 financial year, accounting for roughly 20% of total UK day-to-day public spending of £930bn.
Following an independent review,[1] NHS Digital (the national provider of information, data and IT systems for healthcare commissioners, analysts and clinicians) and Health Education England merged into NHS England in the first part of 2023. In the words of the review:
“Digital technology is transforming every industry including healthcare. Digital and data have been used to redesign services, raising citizen expectations about self-service, personalisation, and convenience, and increasing workforce productivity. The pandemic has accelerated the shift to online and changed patient expectations and clinical willingness to adopt new ways of working. In addition, it facilitated new collaborations both in the centre of the NHS and wider local health and care systems. Together, these changes have enabled previously unimaginable progress in digitally enabled care pathways.”
[1] NHS England, ‘Health Education England, NHS Digital and NHS England have merged into a single organisation’.
Under the UK Education Act 1996, the Secretary of State is empowered to require every maintained (publicly funded) school to record and keep large amounts of data relating to students, teachers and parents. The data to be kept includes personal details relating to students and staff, and educational records relating to attendance, attainment, school curriculum, exams and educational provision. These record-keeping requirements for the UK’s 30,000 schools have driven the development of the UK’s education technology sector over the last forty or so years: each school keeps the required data and records in its management information system (‘MIS’), which connects with complementary software applications that read the data in the MIS to perform additional functions like payment, engagement, accounting and messaging.
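The sketch below illustrates that pattern: a complementary application reading attendance data from a school’s MIS through an integration API in order to message parents. The endpoint, credentials and field names are hypothetical, since each MIS supplier exposes its own API.

```python
# Hypothetical sketch: a messaging app reads today's attendance from a school MIS
# via an integration API. Endpoint, key and field names are invented.
import requests  # third-party HTTP client (pip install requests)
from datetime import date

MIS_BASE = "https://mis.example-school.sch.uk/api/v1"   # hypothetical MIS endpoint
API_KEY = "<integration key issued by the MIS supplier>"

def absent_pupils(on: date) -> list:
    """Return pupils marked absent on a given date so the app can message parents."""
    response = requests.get(
        f"{MIS_BASE}/attendance",
        params={"date": on.isoformat(), "status": "absent"},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json().get("pupils", [])

for pupil in absent_pupils(date.today()):
    print(pupil.get("name"), pupil.get("form"))
```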
As in all developed states, His Majesty’s Government (‘HMG’) holds the country’s largest database about its citizens, and government departments like Business and Trade, Education, Health and Social Care, Home Office, Revenue and Customs, Science, Innovation and Technology and Work and Pensions have huge and growing databases. As individual government departments increasingly master their own digital data and central government as a whole intensifies data sharing and use of AI, HMG’s data estate is now recognised as a valuable national asset. Looked at as an asset, the UK’s data estate raises complex policy questions about its protection, growth, maintenance and monetisation, along with the reconciliation of competing interests: protection of privacy and other individual liberties, the security of the State and its citizens, crime and fraud prevention, commercial interests, safeguards against State overreach, and maximising the benefits of technological progress for citizens.
The UK’s National Data Strategy policy paper sets out the broad opportunity for data in the UK public sector:
“We are currently in the middle of a fourth industrial revolution. Technological innovation has transformed our lives, changing the way we live, work and play. At the same time, this innovation has brought with it an exponential growth in data: in its generation and use, and in the world’s increasing reliance upon it. By embracing data and the benefits its use brings, the UK now faces tangible opportunities to improve our society and grow our economy. If we get this right, data and data-driven technologies like AI could provide a significant boost to the entire economy. Data can improve productivity and provide better-quality jobs. But it can also transform our public services and dramatically improve health outcomes nationally. It can keep us safe and assist the reduction of crime, speed the journey to decarbonisation, and, used well, drive efforts to create a more inclusive, less biased society.”[1]
[1] National Data Strategy, GOV.UK (www.gov.uk), 8 December 2020, point 2 (‘The data opportunity’).