{"id":1450,"date":"2026-01-20T12:04:31","date_gmt":"2026-01-20T11:04:31","guid":{"rendered":"https:\/\/giulia-governatori.alwaysdata.net\/?p=1450"},"modified":"2026-01-20T12:11:02","modified_gmt":"2026-01-20T11:11:02","slug":"i-tried-to-use-chatgpts-architecture-to-predict-electricity-consumption-heres-what-happened","status":"publish","type":"post","link":"https:\/\/giulia-governatori.alwaysdata.net\/fr\/i-tried-to-use-chatgpts-architecture-to-predict-electricity-consumption-heres-what-happened\/","title":{"rendered":"J'ai voulu utiliser l'architecture de ChatGPT pour pr\u00e9dire la consommation \u00e9lectrique - voici ce qui s'est pass\u00e9"},"content":{"rendered":"<p>Comme beaucoup de data scientists d\u00e9butants, j'\u00e9tais persuad\u00e9e que les technologies r\u00e9centes surpassaient forc\u00e9ment les anciennes. Les Transformers, ces r\u00e9seaux de neurones qui font tourner ChatGPT et r\u00e9volutionnent l'intelligence artificielle depuis 2017, devaient logiquement \u00e9craser les LSTM, une architecture invent\u00e9e en 1997. Vingt ans d'\u00e9cart, des milliards de dollars d'investissement, des publications scientifiques par milliers \u2014 le match semblait jou\u00e9 d'avance. Dans mon cas, \u00e7a ne s'est pas pass\u00e9 comme pr\u00e9vu.<\/p>\n\n\n<style>.wp-block-kadence-advancedheading.kt-adv-heading1450_2bb75d-c7, .wp-block-kadence-advancedheading.kt-adv-heading1450_2bb75d-c7[data-kb-block=\"kb-adv-heading1450_2bb75d-c7\"]{font-style:normal;}.wp-block-kadence-advancedheading.kt-adv-heading1450_2bb75d-c7 mark.kt-highlight, .wp-block-kadence-advancedheading.kt-adv-heading1450_2bb75d-c7[data-kb-block=\"kb-adv-heading1450_2bb75d-c7\"] mark.kt-highlight{font-style:normal;color:#f76a0c;-webkit-box-decoration-break:clone;box-decoration-break:clone;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;}.wp-block-kadence-advancedheading.kt-adv-heading1450_2bb75d-c7 img.kb-inline-image, .wp-block-kadence-advancedheading.kt-adv-heading1450_2bb75d-c7[data-kb-block=\"kb-adv-heading1450_2bb75d-c7\"] img.kb-inline-image{width:150px;vertical-align:baseline;}<\/style>\n<h2 class=\"kt-adv-heading1450_2bb75d-c7 wp-block-kadence-advancedheading\" data-kb-block=\"kb-adv-heading1450_2bb75d-c7\">Le d\u00e9fi : anticiper la consommation pour optimiser les achats d'\u00e9nergie<\/h2>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"alignleft size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"945\" height=\"591\" src=\"https:\/\/giulia-governatori.alwaysdata.net\/wp-content\/uploads\/1-1.webp\" alt=\"\" class=\"wp-image-1454\" style=\"width:307px;height:auto\" srcset=\"https:\/\/giulia-governatori.alwaysdata.net\/wp-content\/uploads\/1-1.webp 945w, https:\/\/giulia-governatori.alwaysdata.net\/wp-content\/uploads\/1-1-600x375.webp 600w, https:\/\/giulia-governatori.alwaysdata.net\/wp-content\/uploads\/1-1-150x94.webp 150w, https:\/\/giulia-governatori.alwaysdata.net\/wp-content\/uploads\/1-1-768x480.webp 768w, https:\/\/giulia-governatori.alwaysdata.net\/wp-content\/uploads\/1-1-18x12.webp 18w\" sizes=\"auto, (max-width: 945px) 100vw, 945px\" \/><\/figure>\n<\/div>\n\n\n<p>Pour mon tout premier projet de deep learning, j'ai travaill\u00e9 sur un cas concret : pr\u00e9dire la consommation \u00e9lectrique d'un foyer 24 heures \u00e0 l'avance. Derri\u00e8re ce probl\u00e8me technique se cache un enjeu \u00e9conomique majeur. Les op\u00e9rateurs de smart grids ach\u00e8tent chaque jour de l'\u00e9lectricit\u00e9 sur le march\u00e9 spot europ\u00e9en (EPEX) pour livraison le lendemain. Une pr\u00e9vision trop haute, c'est du gaspillage. Trop basse, ce sont des p\u00e9nalit\u00e9s. Dans le sc\u00e9nario fictif de mon projet, ces erreurs repr\u00e9sentaient 62 millions d'euros de pertes annuelles.<\/p>\n\n\n\n<p>J'ai utilis\u00e9 un dataset r\u00e9el de l'UCI : quatre ann\u00e9es de mesures minute par minute d'un foyer \u00e0 Sceaux, en banlieue parisienne. Apr\u00e8s agr\u00e9gation horaire et feature engineering \u2014 variables m\u00e9t\u00e9o, encodage cyclique des heures, lags de consommation \u2014 j'avais 34 000 heures de donn\u00e9es et 34 variables pr\u00e9dictives.<\/p>\n\n\n<style>.wp-block-kadence-advancedheading.kt-adv-heading1450_9517d4-c3, .wp-block-kadence-advancedheading.kt-adv-heading1450_9517d4-c3[data-kb-block=\"kb-adv-heading1450_9517d4-c3\"]{font-style:normal;}.wp-block-kadence-advancedheading.kt-adv-heading1450_9517d4-c3 mark.kt-highlight, .wp-block-kadence-advancedheading.kt-adv-heading1450_9517d4-c3[data-kb-block=\"kb-adv-heading1450_9517d4-c3\"] mark.kt-highlight{font-style:normal;color:#f76a0c;-webkit-box-decoration-break:clone;box-decoration-break:clone;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;}.wp-block-kadence-advancedheading.kt-adv-heading1450_9517d4-c3 img.kb-inline-image, .wp-block-kadence-advancedheading.kt-adv-heading1450_9517d4-c3[data-kb-block=\"kb-adv-heading1450_9517d4-c3\"] img.kb-inline-image{width:150px;vertical-align:baseline;}<\/style>\n<h2 class=\"kt-adv-heading1450_9517d4-c3 wp-block-kadence-advancedheading\" data-kb-block=\"kb-adv-heading1450_9517d4-c3\">Le face-\u00e0-face : LSTM contre Transformer<\/h2>\n\n\n\n<p>J'ai construit deux mod\u00e8les. Le LSTM, d'abord : deux couches, 38 000 param\u00e8tres, architecture classique mais \u00e9prouv\u00e9e. Puis le Transformer : m\u00e9canisme d'attention, encodage positionnel, 70 000 param\u00e8tres \u2014 l'artillerie lourde.<\/p>\n\n\n\n<p>Les r\u00e9sultats bruts semblaient donner raison au Transformer. Son erreur moyenne (MAE) atteignait 0.4086 kW contre 0.4145 kW pour le LSTM. Un avantage de 1.4%. Victoire ? Pas si vite.<\/p>\n\n\n<style>.wp-block-kadence-advancedheading.kt-adv-heading1450_68a896-f4, .wp-block-kadence-advancedheading.kt-adv-heading1450_68a896-f4[data-kb-block=\"kb-adv-heading1450_68a896-f4\"]{font-style:normal;}.wp-block-kadence-advancedheading.kt-adv-heading1450_68a896-f4 mark.kt-highlight, .wp-block-kadence-advancedheading.kt-adv-heading1450_68a896-f4[data-kb-block=\"kb-adv-heading1450_68a896-f4\"] mark.kt-highlight{font-style:normal;color:#f76a0c;-webkit-box-decoration-break:clone;box-decoration-break:clone;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;}.wp-block-kadence-advancedheading.kt-adv-heading1450_68a896-f4 img.kb-inline-image, .wp-block-kadence-advancedheading.kt-adv-heading1450_68a896-f4[data-kb-block=\"kb-adv-heading1450_68a896-f4\"] img.kb-inline-image{width:150px;vertical-align:baseline;}<\/style>\n<h2 class=\"kt-adv-heading1450_68a896-f4 wp-block-kadence-advancedheading\" data-kb-block=\"kb-adv-heading1450_68a896-f4\">La m\u00e9trique qui change tout : l'overfitting<\/h2>\n\n\n\n<p>En creusant les r\u00e9sultats, j'ai d\u00e9couvert un signal d'alarme. L'\u00e9cart entre la performance en entra\u00eenement et en test \u2014 ce qu'on appelle la pourcentage d'overfitting ((validation_loss \u2212 train_loss)\/train_loss)) \u2014 atteignait 6.62% pour le Transformer contre seulement 1.63% pour le LSTM. Autrement dit, le Transformer avait tendance \u00e0 m\u00e9moriser les donn\u00e9es d'entra\u00eenement plut\u00f4t qu'\u00e0 apprendre les vrais patterns sous-jacents.<\/p>\n\n\n\n<p>En production, face \u00e0 des donn\u00e9es in\u00e9dites, ce comportement peut \u00eatre catastrophique. Un mod\u00e8le qui g\u00e9n\u00e9ralise mal est un mod\u00e8le dangereux.<\/p>\n\n\n\n<p>De plus, le LSTM s'entra\u00eenait en moins de 2 minutes contre plus de 7 pour le Transformer. Plus simple, plus rapide, plus robuste : le choix \u00e9tait fait.<\/p>\n\n\n<style>.wp-block-kadence-advancedheading.kt-adv-heading1450_3ce96e-dd, .wp-block-kadence-advancedheading.kt-adv-heading1450_3ce96e-dd[data-kb-block=\"kb-adv-heading1450_3ce96e-dd\"]{font-style:normal;}.wp-block-kadence-advancedheading.kt-adv-heading1450_3ce96e-dd mark.kt-highlight, .wp-block-kadence-advancedheading.kt-adv-heading1450_3ce96e-dd[data-kb-block=\"kb-adv-heading1450_3ce96e-dd\"] mark.kt-highlight{font-style:normal;color:#f76a0c;-webkit-box-decoration-break:clone;box-decoration-break:clone;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;}.wp-block-kadence-advancedheading.kt-adv-heading1450_3ce96e-dd img.kb-inline-image, .wp-block-kadence-advancedheading.kt-adv-heading1450_3ce96e-dd[data-kb-block=\"kb-adv-heading1450_3ce96e-dd\"] img.kb-inline-image{width:150px;vertical-align:baseline;}<\/style>\n<h2 class=\"kt-adv-heading1450_3ce96e-dd wp-block-kadence-advancedheading\" data-kb-block=\"kb-adv-heading1450_3ce96e-dd\">Ce que j'en retiens<\/h2>\n\n\n\n<p>Dans ce contexte pr\u00e9cis \u2014 un seul foyer, quatre ann\u00e9es de donn\u00e9es, des patterns de consommation relativement r\u00e9guliers \u2014 le LSTM s'est r\u00e9v\u00e9l\u00e9 plus adapt\u00e9. Le Transformer a probablement besoin de datasets plus volumineux ou de s\u00e9quences plus complexes pour exprimer tout son potentiel. C'est d'ailleurs ce que sugg\u00e8re la litt\u00e9rature scientifique : l'attention brille sur les longues d\u00e9pendances et les grands volumes de donn\u00e9es, moins sur les s\u00e9ries temporelles courtes et structur\u00e9es.<\/p>\n\n\n\n<p>Mon mod\u00e8le LSTM final permet d'\u00e9conomiser environ 28 millions d'euros par an en r\u00e9duisant les erreurs de pr\u00e9vision de 45%. Je suis fi\u00e8re de ce r\u00e9sultat pour un premier projet deep learning.<\/p>\n\n\n<style>.wp-block-kadence-advancedheading.kt-adv-heading1450_2da5a2-63, .wp-block-kadence-advancedheading.kt-adv-heading1450_2da5a2-63[data-kb-block=\"kb-adv-heading1450_2da5a2-63\"]{font-style:normal;}.wp-block-kadence-advancedheading.kt-adv-heading1450_2da5a2-63 mark.kt-highlight, .wp-block-kadence-advancedheading.kt-adv-heading1450_2da5a2-63[data-kb-block=\"kb-adv-heading1450_2da5a2-63\"] mark.kt-highlight{font-style:normal;color:#f76a0c;-webkit-box-decoration-break:clone;box-decoration-break:clone;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;}.wp-block-kadence-advancedheading.kt-adv-heading1450_2da5a2-63 img.kb-inline-image, .wp-block-kadence-advancedheading.kt-adv-heading1450_2da5a2-63[data-kb-block=\"kb-adv-heading1450_2da5a2-63\"] img.kb-inline-image{width:150px;vertical-align:baseline;}<\/style>\n<h2 class=\"kt-adv-heading1450_2da5a2-63 wp-block-kadence-advancedheading\" data-kb-block=\"kb-adv-heading1450_2da5a2-63\">Un appel aux experts<\/h2>\n\n\n\n<p>Cela dit, je reste une d\u00e9butante dans ce domaine. Si vous \u00eates un professionnel exp\u00e9riment\u00e9 et que vous voyez des pistes d'am\u00e9lioration \u2014 sur l'architecture du Transformer, les hyperparam\u00e8tres, ou la strat\u00e9gie d'entra\u00eenement \u2014 je serais ravie d'en discuter. Le notebook complet et la m\u00e9thodologie d\u00e9taill\u00e9e sont disponibles <strong><a href=\"https:\/\/giulia-governatori.alwaysdata.net\/fr\/projects\/lstm-vs-transformer-next-day-energy-forecasting-on-smart-grid\/\" data-type=\"projects\" data-id=\"1437\">ici<\/a><\/strong>. N'h\u00e9sitez pas \u00e0 me contacter, par mail (giuliagovernatori@hotmail.com) ou en commentant mon post linkedin (link).<\/p>\n\n\n\n<p><\/p>","protected":false},"excerpt":{"rendered":"<p>Like many beginner data scientists, I was convinced that recent technologies inevitably outperformed older ones. Transformers, the neural networks powering ChatGPT that have been revolutionizing artificial intelligence since 2017, should logically crush LSTMs, an architecture invented in 1997. Twenty years apart, billions of dollars in investment, thousands of scientific publications \u2014 the match seemed decided&#8230;<\/p>","protected":false},"author":1,"featured_media":1451,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","footnotes":""},"categories":[1],"tags":[],"class_list":["post-1450","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-non-classe"],"_links":{"self":[{"href":"https:\/\/giulia-governatori.alwaysdata.net\/fr\/wp-json\/wp\/v2\/posts\/1450","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/giulia-governatori.alwaysdata.net\/fr\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/giulia-governatori.alwaysdata.net\/fr\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/giulia-governatori.alwaysdata.net\/fr\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/giulia-governatori.alwaysdata.net\/fr\/wp-json\/wp\/v2\/comments?post=1450"}],"version-history":[{"count":3,"href":"https:\/\/giulia-governatori.alwaysdata.net\/fr\/wp-json\/wp\/v2\/posts\/1450\/revisions"}],"predecessor-version":[{"id":1456,"href":"https:\/\/giulia-governatori.alwaysdata.net\/fr\/wp-json\/wp\/v2\/posts\/1450\/revisions\/1456"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/giulia-governatori.alwaysdata.net\/fr\/wp-json\/wp\/v2\/media\/1451"}],"wp:attachment":[{"href":"https:\/\/giulia-governatori.alwaysdata.net\/fr\/wp-json\/wp\/v2\/media?parent=1450"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/giulia-governatori.alwaysdata.net\/fr\/wp-json\/wp\/v2\/categories?post=1450"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/giulia-governatori.alwaysdata.net\/fr\/wp-json\/wp\/v2\/tags?post=1450"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}