
How is AI going to affect my job?

  • Writer: Sixtine Vervial
  • Mar 5
  • 13 min read

Updated: Mar 7


Last night, my wonderful dad asked me a very important question. It came from a place of great knowledge of the tech and consulting industries, concern for his daughter's future savings, and care for her professional development. "Have you ever thought of how artificial intelligence is going to affect your job?" he asked. "But dad, artificial intelligence IS my job," I answered, with the great defiance he knows me capable of. He made such a great point, though, that once my ego had gone back to minding its own business, and dreams had mixed the associated fears with the events of the day, I couldn't help but take the time to write some thoughts down.


Before heading into the exact impact of AI on my daily work and career, let's review the semantics. My Master's degree has the title "Data Science" on it, as they had just changed it from "Decision Support Engineering". What I studied to become a Data Scientist is mainly mathematics, because it turns out that information is ruled by a lot of equations, and becoming a data scientist means learning how to maximise the insights (derived information) that we extract from data points (raw information). A quick look at Google Trends will show you how the media picked up on several buzzwords to describe this activity: Big Data (2011), Data Science (2013), Machine Learning (2014) and finally Generative AI (2023) are just neighbours in the same information technology community. The general idea behind all these techniques is to have access to more and more powerful machines that help us solve those equations at a higher scale.


During the first years of his career, my dad used to send his computations to a data center overseas and go flirt with my mom in the streets of Paris while waiting for the results. When I was studying, a short cigarette break was enough to get me the same results. Now, ten years into the business, our phones listen to us and present us with recommendations before we even get a chance to ask the question. What is the future of the information business, you ask? Are algorithms going to take over the world? In this article, I will answer my dad's question by first acknowledging how AI changed my everyday life and tasks, then diving into how it changes the way I code, and finally explaining why I do not believe we'll be ruled by the machine anytime soon.



Table of contents


  1. AI helps in my daily life

  2. AI transforms the way I code

    1. Discovery / insights : data analysis

    2. ETL / data pipelines : data engineering

    3. Prediction / recommendation : data science

  3. AI allows me to focus on the human part of my job

    1. Because most humans aren't trained to use AI

    2. Because applied AI can’t be a black box

    3. Because high-tech requires well designed interfaces

    4. Because AI is expensive

    5. Because AI needs maintenance and regulations

    6. Because AI requires us to change, and we hate change





  1. AI helps in my daily life


I am beyond thankful for the integration of AI components into the everyday organisation and communication tools we use in startup life. Yes, I do have to agree to share some personal information with the "big bad GAFA", but I consider that the time saved with those tools helps me focus on the greater picture. Here are a few examples of how I use AI to ease my daily tasks. Because I am data-driven by construction, I will attempt to estimate the time saved along the way.


  • Google Mail: to help me draft easy answers to emails

    • time saved: 15 minutes daily

    • greater impact: reach inbox-zero every week instead of once every quarter

  • Notion: to help me structure blank pages

    • time saved: 30 minutes for each new task / project

    • greater impact: lower the fear of the blank page, kickoff tasks from scratch

  • ChatGPT: to help with travel logistics

    • time saved: 45 minutes per travel booked

    • greater impact: lower travel anxiety, save money on luggage fees, optimize international/duty-free shopping

  • ChatGPT/Notion: to rephrase, shorten, review

    • time saved: ~ 1 hour per week

    • greater impact: learning new idioms, more efficient and appropriate communication

  • Google photos: to search through photos with keywords

    • time saved: 1 hour per week

    • greater impact: avoid infinite scrolling and reduce screen time


All together, that's at least one surf session or two yoga sessions that AI gifts me every week.


Overall, I wouldn't say that I fully rely on AI and would be lost without it, but I am much obliged to the technical teams that embed these components into our tools, improving my emails, messages, and technical documents. While I believe AI has enormous potential, I am grateful for the specific applications that have appeared in our favorite apps, enhancing the user experience and thus helping us achieve our objectives faster, ultimately letting us leave our screens behind to focus on human relations.


  2. AI transforms the way I code


Even though most of my working time is spent thinking, discussing, understanding, interviewing, reading data and brainstorming action items with my stakeholders, my job as a Data Solution Architect involves several phases where writing code is necessary. From data discovery, to engineering, to data science, each step is nowadays accompanied by artificial intelligence, which helps me deliver more complete work, more efficiently.


  • Discovery / insights : data analysis


As we kick off a project, we start with a round of presentations of the business objectives that motivate the resources involved in creating a new data product. From a simple dashboard to a complex recommendation engine or the integration of an AI component into the product, we cannot bypass the data discovery phase. That stage lists all the relevant data assets that will be involved in the machine decision process and their sources, and runs a meticulous analysis of the datasets involved. This process can be redundant from one project to the next, and the code written (often in a Jupyter notebook, with dataframe exploration) extremely repetitive. At this stage, I often give a 5-line sample of my dataset to ChatGPT and ask it to produce a long list of graphs to get a good idea of how versatile my data is. Below is an example of a function that can do the job in Python. I use AI to produce a long baseline of code and then sort dimensions, arrange the graphs for better visibility, note down some first insights (on missing values, distributions) and write the first conclusions on data usability for the project.


%python

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

def analyze_dimension(df, dimension):
    # 1. Percentage of nulls per group
    null_percentage = df.groupby(dimension).apply(lambda x: x.isnull().mean() * 100)

    # 2. Describe numerical features by group
    description = df.groupby(dimension).describe()

    # 3. Visualize distributions of every numeric column
    numeric_cols = df.select_dtypes(include=['number']).columns
    for col in numeric_cols:
        plt.figure(figsize=(8, 5))
        sns.histplot(df[col], bins=30, kde=True)
        plt.title(f"Histogram of {col}")
        plt.xlabel(col)
        plt.ylabel("Count")
        plt.show()

    return null_percentage, description

# Example usage
# nulls, desc = analyze_dimension(df, 'your_column_name')

Note that typically no product manager or stakeholder will enjoy reading through my Jupyter notebook, so all I need to focus on now is how to deliver this information in the best visualisation possible (often with slides picking the most valuable insights) and how to use those plain statistics for their operations.



  • ETL / data pipelines : data engineering


Once the input data has been validated and a proof of concept is working locally (or on a dedicated machine), the time comes to approach the DevOps team to integrate my data flows within the structure of their applications. I often go for an Airflow instance that enables me to write data pipelines between their sources (of truth). Even though it is now more common to find standard open-source Airflow operators, AI is today a great addition to Stack Overflow when it comes to the development of custom operators. Additionally, monitoring tasks, which used to be left to the "whenever we have time, just make it work for now" sprint, are now easily included from the very first commit thanks to an exhaustive periodic review of the data transformations. While libraries such as great-expectations used to help us with the structure of this monitoring, AI is now able to help us design custom expectations related to the content of the data in addition to its structure. Not only are missing values spotted, but also deviations from standard values, which significantly augments the added value of those tests, and therefore the reliability of our data as a whole.


Find below an example of an Airflow task that validates expectations on three dataset columns: ID is not null, age is between 18 and 100, and mean score is between 70 and 100. The first two expectations are related to data quality, whereas the third one might help identify deviations from expected standard values.

import great_expectations as ge
from airflow.operators.python import PythonOperator

def validate_data(df):
    ge_df = ge.from_pandas(df)

    # Define expectations
    ge_df.expect_column_values_to_not_be_null("id")
    ge_df.expect_column_values_to_be_between("age", 18, 100, mostly=0.9)
    ge_df.expect_column_mean_to_be_between("score", 70, 100)

    # Validate expectations
    results = ge_df.validate()

    # Raise an error if validation fails
    if not results["success"]:
        raise ValueError("Data validation failed!")

# Airflow Task
validate_task = PythonOperator(
    task_id="validate_data_task",
    python_callable=validate_data,
    dag=dag,
)

Again, deciding whether the input data HAS TO match some expectations in order to be usable (e.g. because some algorithms do not accept missing values) or SHOULD BE in a certain range to be in line with business objectives is where the Data Solution Architect's role comes into play. Designing those expectations with the stakeholders is a key step to ensure that the data product works as expected by its human operators, and to monitor the quality of the results.
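To make the HAS TO / SHOULD BE distinction concrete, here is a minimal plain-Python sketch (deliberately not using great-expectations, so it runs anywhere): hard expectations raise an error and stop the pipeline, soft business-range expectations only return warnings for the monitoring dashboard. The function name and thresholds are illustrative, not from any library.

```python
def check_expectations(rows):
    """Separate hard failures (data unusable) from soft warnings (business range)."""
    warnings = []
    for row in rows:
        # HARD: id must be present, otherwise downstream joins cannot proceed
        if row.get("id") is None:
            raise ValueError(f"Hard expectation failed: missing id in {row}")
        # HARD: age must be plausible for the algorithm downstream
        if not 18 <= row["age"] <= 100:
            raise ValueError(f"Hard expectation failed: age out of range in {row}")
    # SOFT: mean score should stay in the agreed business range;
    # a deviation is a signal to investigate, not a blocker
    mean_score = sum(r["score"] for r in rows) / len(rows)
    if not 70 <= mean_score <= 100:
        warnings.append(f"Soft expectation failed: mean score {mean_score:.1f} outside [70, 100]")
    return warnings

# Example usage: mean score is (82 + 64) / 2 = 73.0, inside the business range
rows = [{"id": 1, "age": 34, "score": 82}, {"id": 2, "age": 51, "score": 64}]
print(check_expectations(rows))  # [] -> no warnings
```

In a real pipeline, the hard checks would fail the Airflow task, while the soft warnings would feed an alerting channel agreed with the stakeholders.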



  • Prediction / recommendation : data science


As I mentioned in the introduction, time-to-insights has drastically improved over the past years. Not only do we no longer need to wait long to access results, but low-cost machines are also available on demand, leaving almost anyone with the tools to play around with machine learning. Ultimately, this enables data scientists to test, compare, and evolve models a lot faster. Additionally, dedicated interfaces have appeared in the cloud platforms (Azure, Google, AWS) to provide a visual way to set up sandboxes and algorithms. While this would theoretically empower lots of engineers and stakeholders with no mathematical training to run some X.fit(), set up an API and let predictions flow, the reason we haven't seen models appear in every application is that going from "let's implement some AI in our application" to an effective model in production requires a deep understanding of the underlying models. Just as data exploration, validation and transfer have to be thought through and monitored, "simply build a model and we'll see" doesn't quite work.


Firstly, model selection is guided by the qualities of the input data: volume (scarcity, cold-start issues), variety (representativity, unknown contexts, sound/image handling), velocity (variance over time, time to process, input throughput) and veracity (noise, bias) are aspects to be considered carefully.


An extensive variety of tools has been democratized over the past decades, in pre- and post-processing (re-sampling, re-weighting, data augmentation // calibration, disparate impact analysis) as well as in the models themselves (fair algorithms, constraint-based learning). Finally, AI-powered explainability tools (e.g., SHAP, LIME) help understand and mitigate the bias that might creep back.
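As an illustration of the re-weighting idea mentioned above, here is a tiny stdlib sketch: each sample gets a weight inversely proportional to its group's frequency, so every group contributes equally to the training loss. The function and field names are made up for the example; real libraries (e.g. scikit-learn's `sample_weight` arguments) accept weights computed this way.

```python
from collections import Counter

def reweight(samples, group_key):
    """Weight each sample inversely to its group's frequency, so that
    under-represented groups contribute as much as dominant ones."""
    counts = Counter(s[group_key] for s in samples)
    n_groups = len(counts)
    total = len(samples)
    # Each group's weights sum to total / n_groups
    return [total / (n_groups * counts[s[group_key]]) for s in samples]

# Example usage: 3 samples of group F, 1 of group M
samples = [{"gender": "F"}, {"gender": "F"}, {"gender": "F"}, {"gender": "M"}]
weights = reweight(samples, "gender")
# The three F weights sum to 2.0, matching the single M sample's weight of 2.0
print(weights)
```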


A simple example that demonstrates the importance of bias mitigation in relation to the applicative context: consider a recommendation algorithm for hiring. In some countries, over certain periods of time, if we train a model on past data including the gender dimension, it might only offer matches for nursing jobs to women, which would be ethically very wrong. This is where the role of the Data Product Manager comes into play: ensuring that all these potential mathematical pitfalls do not hinder but enhance the user experience, and keeping the application in scope.


  3. AI allows me to focus on the human part of my job


Are machines going to take over our jobs? My simple answer is no. We have seen generations of humans with free access to libraries who still don't know how to boil an egg. Personal washing machines appeared in the mid-1800s, and yet we estimate there are over a million laundromats worldwide. Spell-checking has been on our laptops for decades, and yet I receive messages full of typos on a daily basis. Why?


  • Because most humans aren't trained to use AI


For those born in the 90's: do you remember the frustrating time allocated on computers in high school, when an 80-year-old librarian would teach us how to use keywords to search for a book? I do. I also remember that most of the class was playing Minesweeper instead. Those same classmates are the ones texting me "could you send me the code for Netflix again" for the 5th time this month instead of using the "search" function in WhatsApp.


The greater and faster the answers become, the more important the questions we ask will be. Did our great-grandparents need to worry about learning how to navigate Tokyo's subway? Did our parents have to choose between a large variety of electricity providers? Since we were given the chance to vote, our responsibility to make informed choices has been growing. It is urgent to train our younger (and older!) generations to use these tools properly: what to give them for training, what to ask, and how to ask it to get an unbiased and reliable answer.


  • Because applied AI can’t be a black box


Once we have our first model in production, monitoring the output together with the stakeholders (product team) ensures that we provide the expected results, in particular in contexts that the training set might not have been aware of. It is crucial to validate the scope of the data product in order to avoid any misuse by end-users that might lead to strange recommendations / insights, or have a dangerous impact in the case of operational applications. For instance, I once trained a model to decide where and when to surf in France based on weather forecasts. While traveling in Asia, I thought about reusing the same application, but it might have led me straight to the hospital: Indonesian waves often break on reef (as opposed to sand banks in France), and my algorithm was not aware of the tide parameter, thus encouraging me to go out in 10 cm of water.
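One cheap way to enforce such a scope, sketched below under my own naming (no particular library implied): record the range of every feature seen at training time, and flag any serving-time input that falls outside it or carries a dimension the model never saw, so the application can abstain rather than extrapolate.

```python
def build_scope_guard(training_rows):
    """Record the min/max of each feature seen at training time and
    return a function that flags out-of-scope serving inputs."""
    bounds = {}
    for row in training_rows:
        for feature, value in row.items():
            lo, hi = bounds.get(feature, (value, value))
            bounds[feature] = (min(lo, value), max(hi, value))

    def out_of_scope(row):
        # Any unknown feature, or value outside the training range, is flagged
        return [f for f, v in row.items()
                if f not in bounds or not bounds[f][0] <= v <= bounds[f][1]]

    return out_of_scope

# Example usage: trained on French beach breaks, no tide dimension
out_of_scope = build_scope_guard([{"swell_m": 0.5}, {"swell_m": 2.5}])
print(out_of_scope({"swell_m": 1.2}))                 # [] -> safe to predict
print(out_of_scope({"swell_m": 1.2, "tide_m": 0.1}))  # ['tide_m'] -> abstain
```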


Solid monitoring of the results exposed to our end-users ensures that if the first version of a model is not perfect, we develop it iteratively towards a fairer and more efficient version. Now the question arises: when do we update the model? And how do we decide that a new version is better than the old one? Here again, data scientists should be led by the stakeholders in the evolution of the training set, to include various scenarios and expectations based on user testing. The methodology there looks similar to iterative product development, where we loop on the following key steps: discovery, implementation, impact assessment.
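The "is the new version better?" decision can be reduced to a promotion rule agreed with the stakeholders. Here is a hedged sketch of one such rule (a simple champion/challenger gate; the metric names and the 2-point margin are placeholders for whatever the product team decides):

```python
def promote(champion_metrics, challenger_metrics, min_gain=0.02):
    """Promote the challenger only if it beats the champion by at least
    min_gain on every metric agreed with the stakeholders."""
    return all(challenger_metrics[m] >= champion_metrics[m] + min_gain
               for m in champion_metrics)

# Example usage: precision improves enough, recall does not
champion   = {"precision": 0.81, "recall": 0.74}
challenger = {"precision": 0.85, "recall": 0.75}
print(promote(champion, challenger))  # False: recall gain is below the margin
```

Real teams often add statistical significance tests or online A/B comparisons on top, but the principle stays the same: promotion criteria are a product decision, encoded before the experiment starts.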


  • Because high-tech requires well designed interfaces


We now have applications that generate text and images based on speech/text inquiries, and we see some integrations of AI modules appearing in our web/mobile apps to facilitate specific functions (such as CV parsing into profiles for hiring platforms). The difference between those two types of usage is the level of the application that we use the AI component for. In the second example, a model alone is far from enough. Designers craft the user experience, developers implement the screens, product managers coordinate it all and conduct user tests, and business developers find ways to monetize the platform, finding a balance between the added value for the end-users and the coverage of their costs. Remember, as a user: if it's free, you're the product!


A good Data Product Manager will understand how to translate the business objectives of the company into the AI modules created. Whether the goal is to save end-users time, save money on operations, or remove pain from a process, clear KPIs and variables will be set to make sure that the user experience benefits from the tools we have at hand, rather than confusingly opening up the horizon of possibilities.


  • Because AI is expensive


Did you know that a single ChatGPT query is estimated to emit 1 to 5 g of CO₂? Ask 50 questions in a day and you'll produce the same emissions as 15 cups of coffee; scale this to the machines that lead scoring engines and recommendation systems need to operate, and you will end up with a serious electricity bill. The machine and human power required to train and operate the applications we know and so easily access on our phones is actually not negligible. Count at least 6 months of a team (data scientist + product manager + devops + developers) to productionize a simple AI module, which represents roughly 120,000 euros. CEOs who want to include AI in their roadmap easily outnumber friends who once told you "so I have this idea for an app". Even though we highly appreciate the hassle that those products can free us from, our ecological conscience and budgets should carefully compute the ROI of those investments.


The role of the Data Product Manager is to include a framework for impact measurement in the initial feasibility and discovery phases of the project. We typically set user expectations (reduce task performance time by 40%, reduce error rate by 80%, etc.) as well as operational SLAs (response time, cost per prediction, etc.) and test our models as soon as possible against those targets in order to ensure the viability of the application.
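Such a framework can be as simple as a table of targets checked automatically against measurements. A minimal sketch, with target names and values invented for the example (the real ones come out of the discovery phase):

```python
def meets_targets(measured, targets):
    """Return the list of KPIs/SLAs that miss their agreed target.
    Each target is (direction, value): 'min' = at least, 'max' = at most."""
    return [k for k, (direction, target) in targets.items()
            if not (measured[k] <= target if direction == "max"
                    else measured[k] >= target)]

# Example usage: targets agreed during discovery
targets = {
    "task_time_reduction_pct": ("min", 40),    # reduce task time by >= 40%
    "error_rate_reduction_pct": ("min", 80),   # reduce error rate by >= 80%
    "cost_per_prediction_eur": ("max", 0.01),  # operational SLA
}
measured = {"task_time_reduction_pct": 45,
            "error_rate_reduction_pct": 62,
            "cost_per_prediction_eur": 0.008}
print(meets_targets(measured, targets))  # ['error_rate_reduction_pct']
```

Running this check from the very first prototype keeps the go/no-go conversation grounded in the numbers everyone signed off on.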


  • Because AI needs maintenance and regulations


Machine Learning means that the machine updates its knowledge based on feedback on its output. This feedback loop is key to improving the model, handling new contexts that might appear, and preventing harmful deviations. And if the model is meant to evolve, it means we can't "just let it run in prod". It needs to be watched, fed, monitored, and sometimes rescoped or repaired.


Additionally, any model, especially in our European context, must comply with certain regulations and guidelines. While the implementation of GDPR measures is now relatively straightforward for developers, frameworks assessing aspects such as data privacy and bias are still under development. Because the regulatory landscape is complex and continuously changing, the DPO has to be aware of those aspects and be able to translate them into technical specifications.


  • Because AI requires us to change, and we hate change


When hired to score leads, I find that the sales team fears me and refuses to input their data into Salesforce. When hired to audit operations, the workforce stops logging errors to avoid being pinned. When hired to implement OKRs, all employees start thinking I'm personally responsible for the low bonuses this year. Don't shoot the messenger? Data practitioners learn that the way to deliver (good and bad) news is 90% of the job.


Having a Data Product Manager can help facilitate the discussion between stakeholders and developers. Instead of brute-forcing algorithms that "do our job", we listen to operators, understand their pain points and how information could bring more value to the time they spend in the office. The goal really is to eradicate dehumanising copy/paste tasks and replace them with critical thinking and brainstorming time. We believe in human added value more than in the power of the machine.



"L’homme est l’être qui est capable d’action." "Man is the being capable of action." Hannah Arendt, La Condition de l’homme moderne (1958)

Many philosophers encourage us to leave our limiting beliefs behind and embrace the role and purpose given to us as beings: wake up, make your bed, live, take action. I believe the new era that AI has thrown us into is scary, but embracing it is the greatest opportunity we have to elevate our actions to a level where the beauty of human creativity is celebrated, rather than the skill to blindly repeat actions that others or systems decide for us. That being said, don't forget to educate yourself (or seek help) to get the most out of these tools, avoid getting fooled by those who set them up, and use them in the most efficient and ethical way possible.




Sixtine Vervial - Data Services
French Auto-Entrepreneur
SIRET 80897424000018

All pictures taken from real travel stories - subject to copyright

©2018 by Sixtine Vervial
