Week 49 — Lessons Learned from the Regulatory AI Projects — Part 3
How to measure readiness for AI
It’s -35 in Ottawa and I’m fresh off a long commute on what has become unreliable public transit. I could write an entire long-winded rant on why the downfall of Ottawa’s public transit system is bad news, but I won’t today. Instead, I’ll add my voice to those who think public transit should be a free public service. It says a lot about our priorities as a city when the fine for parking on the street during a snowstorm is $75 while the fine for fare evasion is set at $150. In what world does it make sense to fine people nearly double for not paying their bus fare compared to creating a legitimate safety hazard by blocking the work of snow plows?
Over the past two weeks, I’ve shared some of the lessons learned that have emerged from the Regulatory AI projects. So far, I have covered the scope and ambitions of the projects and what they mean for future work, and last week I covered the need to be more agile, especially as it relates to procurement. This week I am going to talk about the intersection between AI and policy, but maybe not in the way you think.
With the Regulatory Evaluation Platform project, our aim was to bring natural language processing and machine learning forward as tools to support enhanced and comprehensive regulatory analysis. We would be able to analyze regulations to determine their prescriptiveness, their similarity to other regulations, which were outdated, and other related points of analysis. However, we ran into a number of challenges that couldn’t be solved through data or heuristics alone. Indeed, for some of the problems we wanted machine learning to help solve, the results produced were fascinating for entertainment purposes but lacked a solid policy foundation. The problem was that there was no single source of truth and no objective reality. Instead, we were faced with a situation where the only truth was a subjective “it depends,” which could differ depending on which regulator was doing the analysis. To provide a practical example, imagine you were asked to determine whether a regulation was “outdated” and a good candidate to be updated. Unfortunately, there isn’t an agreed-upon definition of what “outdated regulation” means or how to determine whether a regulation is outdated. Any such determination is purely subjective and varies greatly depending on who is doing the analysis. The end result is inconsistent and unreliable data built on subjective analysis.
The challenge we faced with the Regulatory Evaluation Platform (and other AI projects) is a problem wedged at the intersection of data and policy. On one hand, we don’t have a grounded policy foundation that establishes an objective truth about the measurement we want to make. That means any calculation of the measurement is subjective and hard to compare at a system-wide level. On the other hand, the lack of a policy foundation means we can’t generate enough reliable data to train a model and bring machine learning in to solve the problem. The issue at hand may well be one where machine learning can help, but if the people responsible for doing the work don’t agree on an objective definition of the data point, then the analysis is not useful for decision making.
In practical terms, this means that anyone considering an AI, machine learning or natural language processing project must consider the intersection of data and policy to assess their readiness. Even if I have a lot of data, I have to ask whether those responsible for data collection are working off the same policy definition and the same methodologies for collecting the data. In simpler terms: is 1 Granny Smith apple actually 1 Granny Smith apple, or is it 1 orange or 1 tomato depending on who collected or generated the data?
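One way to put a number on this kind of readiness is to measure whether two people labelling the same data actually agree beyond chance. As a minimal sketch (the labels and the two hypothetical regulators are invented for illustration), here is Cohen’s kappa computed in plain Python over two regulators’ “outdated” calls on the same ten regulations:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Agreement between two annotators, corrected for chance agreement.

    Returns 1.0 for perfect agreement, ~0.0 for agreement no better
    than chance.
    """
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Proportion of items where the two annotators gave the same label
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, based on each annotator's label frequencies
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in counts_a)
    return (observed - expected) / (1 - expected)

# Hypothetical data: two regulators reviewing the same 10 regulations
regulator_1 = ["outdated", "current", "outdated", "current", "outdated",
               "current", "current", "outdated", "current", "outdated"]
regulator_2 = ["outdated", "outdated", "current", "current", "outdated",
               "current", "outdated", "outdated", "current", "current"]

print(round(cohens_kappa(regulator_1, regulator_2), 2))  # → 0.2
```

A kappa this low (0.2) would tell you the “outdated” labels reflect individual judgment more than a shared definition, and that a model trained on them would learn that inconsistency rather than any objective truth.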
AI Demonstrator Projects (Incorporation by Reference, Regulatory Evaluation Platform, Rules as Code)
Regulatory Evaluation Platform: We held our kick-off meeting for the next phase of this project. As discussed last week, we are hoping to do a deep dive into associating regulations with North American Industry Classification System (NAICS) codes so we can accurately track which regulations affect which industries. The other calculations in the platform will support deeper analysis through the industry code, giving us an objective way to analyze the impact of regulations on specific industries. We are also hoping to visualize the entire value chain for a specific industry using this information. That way, a regulator can see the downstream impact regulations have on a particular industry and identify opportunities to streamline their regulations or work with others to align regulatory objectives and regulations.
Incorporation by Reference: The final prototype for this project has been submitted. Next week, we will be running the prototype through a series of tests to compare how the model performs against human paralegals. Can the model pick up as many or more references than trained human paralegals? While we aren’t expecting 100% accuracy, we are hoping the tool can drastically cut the amount of time humans have to spend identifying references and collecting information such as the language, cost and availability of documents that have been incorporated by reference. Currently, we estimate it takes a minimum of 1,300+ hours/year to do this work manually (or 20 minutes per regulation), although the actual amount of time is likely higher considering the work has been ongoing for 2.5 years. The hope is that the tool (while not eliminating manual verification) can drastically cut the amount of time this task takes.
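For a rough sense of scale, the two figures above imply how many regulations are being checked per year. This is a back-of-the-envelope sketch, not an official inventory count:

```python
MINUTES_PER_REGULATION = 20      # estimated manual effort per regulation
ESTIMATED_HOURS_PER_YEAR = 1300  # lower bound from the estimate above

# Convert hours to minutes, then divide by per-regulation effort
regulations_per_year = ESTIMATED_HOURS_PER_YEAR * 60 / MINUTES_PER_REGULATION
print(int(regulations_per_year))  # → 3900
```

In other words, the 20-minutes-per-regulation figure implies a workload on the order of 3,900 regulation checks per year, which is why even a partial automation of the reference-finding step would add up quickly.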
Rules as Code: We held our second workshop this week. We finished our concept model, adding the relationships between the different concepts we mapped in week 1. After finishing the concept model, we moved on to building a decision diagram where we identified the key questions one would ask to determine a) whether they are eligible for vacation pay and b) how much vacation pay they are entitled to. While the basic calculation was simple (when you were hired + years of service), the decision diagram became increasingly complex as we started to throw in questions that had nothing to do with the Canada Labour Standards Regulations, like whether the employee was terminated, whether the employee took medical leave, or whether the employee had already used none, some or all of their vacation pay prior to the moment they were checking their entitlement. An open question to those reading from the Rules as Code community: how far down these kinds of rabbit holes do you go? How deep should we explore these kinds of scenarios? The regulations simply state how to determine eligibility (e.g. employee) and how to calculate the rate of vacation pay based on hire date and years of service. They do not account for the actual “money” calculation, such as salary, use of leave, etc. They provide a simple set of rules to determine the minimum entitlement but do not go as far as prescribing how the minimum changes given all of the possible situations an employee might find themselves in. For those nerdy enough to care, here is our draft concept model. It is a work in progress, so forgive the roughness and any spelling mistakes.
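To give a flavour of what the decision diagram encodes, here is a minimal Rules-as-Code style sketch in Python. The thresholds and rates below are simplified placeholders, not the actual text of the regulations, and the point is what the sketch leaves out: everything about terminations, leave, and vacation already taken that made our diagram balloon.

```python
def vacation_pay_rate(years_of_service: int) -> float:
    """ILLUSTRATIVE ONLY: a simplified rate schedule, not the actual
    Canada Labour Standards Regulations. Thresholds are assumptions."""
    if years_of_service < 5:
        return 0.04   # shorter service -> lower rate (e.g. 4% of earnings)
    elif years_of_service < 10:
        return 0.06   # mid-tier service (e.g. 6%)
    return 0.08       # long service (e.g. 8%)

def vacation_pay(annual_earnings: float, years_of_service: int) -> float:
    """Minimum entitlement as a share of annual earnings."""
    return annual_earnings * vacation_pay_rate(years_of_service)

# Hypothetical employee: $50,000/year, 6 years of service
print(vacation_pay(50_000, 6))  # → 3000.0
```

Codifying this core rule took minutes; the hard part of the workshop was deciding whether (and how) to model all the real-world states the regulations are silent on.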
Future Demonstration Projects: I haven’t talked about future demonstration projects (e.g. next fiscal year). However, I am going to add a placeholder in the weeknotes and dive more into what we have planned when I write my week 50 weeknotes. We have a number of exciting projects in store and I can’t wait to share more about them.
Week 49 is over faster than I imagined. Where does the time go? Have a good week and I’ll see you next week.