a.heateor_sss_amp{padding:0 4px}div.heateor_sss_horizontal_sharing a amp-img{display:inline-block}.heateor_sss_amp_instagram img{background-color:#624E47}.heateor_sss_amp_yummly img{background-color:#E16120}.heateor_sss_amp_youtube img{background-color:#ff0000}.heateor_sss_amp_buffer img{background-color:#000}.heateor_sss_amp_delicious img{background-color:#53BEEE}.heateor_sss_amp_facebook img{background-color:#3C589A}.heateor_sss_amp_digg img{background-color:#006094}.heateor_sss_amp_email img{background-color:#649A3F}.heateor_sss_amp_float_it img{background-color:#53BEEE}.heateor_sss_amp_linkedin img{background-color:#0077B5}.heateor_sss_amp_pinterest img{background-color:#CC2329}.heateor_sss_amp_print img{background-color:#FD6500}.heateor_sss_amp_reddit img{background-color:#FF5700}.heateor_sss_amp_stocktwits img{background-color:#40576F}.heateor_sss_amp_mewe img{background-color:#007da1}.heateor_sss_amp_mix img{background-color:#ff8226}.heateor_sss_amp_tumblr img{background-color:#29435D}.heateor_sss_amp_twitter img{background-color:#55acee}.heateor_sss_amp_vkontakte img{background-color:#5E84AC}.heateor_sss_amp_yahoo img{background-color:#8F03CC}.heateor_sss_amp_xing img{background-color:#00797D}.heateor_sss_amp_instagram img{background-color:#527FA4}.heateor_sss_amp_whatsapp img{background-color:#55EB4C}.heateor_sss_amp_aim img{background-color:#10ff00}.heateor_sss_amp_amazon_wish_list img{background-color:#ffe000}.heateor_sss_amp_aol_mail img{background-color:#2A2A2A}.heateor_sss_amp_app_net img{background-color:#5D5D5D}.heateor_sss_amp_baidu img{background-color:#2319DC}.heateor_sss_amp_balatarin img{background-color:#fff}.heateor_sss_amp_bibsonomy img{background-color:#000}.heateor_sss_amp_bitty_browser img{background-color:#EFEFEF}.heateor_sss_amp_blinklist img{background-color:#3D3C3B}.heateor_sss_amp_blogger_post img{background-color:#FDA352}.heateor_sss_amp_blogmarks img{background-color:#535353}.heateor_sss_amp_bookmarks_fr img{background-color:#E8EAD4}.heateor_sss_amp_box_net img{background-color:#1A74B0}.heateor_sss_amp_buddymarks img{background-color:#ffd400}.heateor_sss_amp_care2_news img{background-color:#6EB43F}.heateor_sss_amp_citeulike img{background-color:#2781CD}.heateor_sss_amp_comment img{background-color:#444}.heateor_sss_amp_diary_ru img{background-color:#E8D8C6}.heateor_sss_amp_diaspora img{background-color:#2E3436}.heateor_sss_amp_dihitt img{background-color:#FF6300}.heateor_sss_amp_diigo img{background-color:#4A8BCA}.heateor_sss_amp_douban img{background-color:#497700}.heateor_sss_amp_draugiem img{background-color:#ffad66}.heateor_sss_amp_dzone img{background-color:#fff088}.heateor_sss_amp_evernote img{background-color:#8BE056}.heateor_sss_amp_facebook_messenger img{background-color:#0084FF}.heateor_sss_amp_fark img{background-color:#555}.heateor_sss_amp_fintel img{background-color:#087515}.heateor_sss_amp_flipboard img{background-color:#CC0000}.heateor_sss_amp_folkd img{background-color:#0F70B2}.heateor_sss_amp_google_classroom img{background-color:#FFC112}.heateor_sss_amp_google_bookmarks img{background-color:#CB0909}.heateor_sss_amp_google_gmail img{background-color:#E5E5E5}.heateor_sss_amp_hacker_news img{background-color:#F60}.heateor_sss_amp_hatena img{background-color:#00A6DB}.heateor_sss_amp_instapaper img{background-color:#EDEDED}.heateor_sss_amp_jamespot img{background-color:#FF9E2C}.heateor_sss_amp_kakao img{background-color:#FCB700}.heateor_sss_amp_kik img{background-color:#2A2A2A}.heateor_sss_amp_kindle_it img{background-color:#2A2A2A}.heateor_sss_amp_known img{background-color:#fff101}.heateor_sss_amp_line img{background-color:#00C300}.heateor_sss_amp_livejournal img{background-color:#EDEDED}.heateor_sss_amp_mail_ru img{background-color:#356FAC}.heateor_sss_amp_mendeley img{background-color:#A70805}.heateor_sss_amp_meneame img{background-color:#FF7D12}.heateor_sss_amp_mixi img{background-color:#EDEDED}.heateor_sss_amp_myspace img{background-color:#2A2A2A}.heateor_sss_amp_netlog img{background-color:#2A2A2A}.heateor_sss_amp_netvouz img{background-color:#c0ff00}.heateor_sss_amp_newsvine img{background-color:#055D00}.heateor_sss_amp_nujij img{background-color:#D40000}.heateor_sss_amp_odnoklassniki img{background-color:#F2720C}.heateor_sss_amp_oknotizie img{background-color:#fdff88}.heateor_sss_amp_outlook_com img{background-color:#0072C6}.heateor_sss_amp_papaly img{background-color:#3AC0F6}.heateor_sss_amp_pinboard img{background-color:#1341DE}.heateor_sss_amp_plurk img{background-color:#CF682F}.heateor_sss_amp_pocket img{background-color:#f0f0f0}.heateor_sss_amp_polyvore img{background-color:#2A2A2A}.heateor_sss_amp_printfriendly img{background-color:#61D1D5}.heateor_sss_amp_protopage_bookmarks img{background-color:#413FFF}.heateor_sss_amp_pusha img{background-color:#0072B8}.heateor_sss_amp_qzone img{background-color:#2B82D9}.heateor_sss_amp_refind img{background-color:#1492ef}.heateor_sss_amp_rediff_mypage img{background-color:#D20000}.heateor_sss_amp_renren img{background-color:#005EAC}.heateor_sss_amp_segnalo img{background-color:#fdff88}.heateor_sss_amp_sina_weibo img{background-color:#ff0}.heateor_sss_amp_sitejot img{background-color:#ffc800}.heateor_sss_amp_skype img{background-color:#00AFF0}.heateor_sss_amp_sms img{background-color:#6ebe45}.heateor_sss_amp_slashdot img{background-color:#004242}.heateor_sss_amp_stumpedia img{background-color:#EDEDED}.heateor_sss_amp_svejo img{background-color:#fa7aa3}.heateor_sss_amp_symbaloo_feeds img{background-color:#6DA8F7}.heateor_sss_amp_telegram img{background-color:#3DA5f1}.heateor_sss_amp_trello img{background-color:#1189CE}.heateor_sss_amp_tuenti img{background-color:#0075C9}.heateor_sss_amp_twiddla img{background-color:#EDEDED}.heateor_sss_amp_typepad_post img{background-color:#2A2A2A}.heateor_sss_amp_viadeo img{background-color:#2A2A2A}.heateor_sss_amp_viber img{background-color:#8B628F}.heateor_sss_amp_wanelo img{background-color:#fff}.heateor_sss_amp_webnews img{background-color:#CC2512}.heateor_sss_amp_wordpress img{background-color:#464646}.heateor_sss_amp_wykop img{background-color:#367DA9}.heateor_sss_amp_yahoo_mail img{background-color:#400090}.heateor_sss_amp_yahoo_messenger img{background-color:#400090}.heateor_sss_amp_yoolink img{background-color:#A2C538}.heateor_sss_amp_youmob img{background-color:#3B599D}.heateor_sss_amp_gentlereader img{background-color:#46aecf}.heateor_sss_amp_threema img{background-color:#2A2A2A}.heateor_sss_vertical_sharing{position:fixed;left:11px;z-index:99999}.heateor-total-share-count .sss_share_count{color:#666;font-size:23px}.heateor-total-share-count .sss_share_lbl{color:#666}.amp-wp-enforced-sizes img[alt="Pinterest"]{background:#cc2329}.amp-wp-enforced-sizes img[alt="Viber"]{background:#8b628f}.amp-wp-enforced-sizes img[alt="Print"]{background:#fd6500}.amp-wp-enforced-sizes img[alt="Threema"]{background:#2a2a2a}.amp-wp-article-content .heateor_sss_vertical_sharing{left:5px}.amp-wp-article-content amp-img[alt="Pinterest"]{left:4px}.amp-wp-enforced-sizes img[alt="MySpace"]{background:#2a2a2a} amp-web-push-widget button.amp-subscribe { display: inline-flex; align-items: center; border-radius: 5px; border: 0; box-sizing: border-box; margin: 0; padding: 10px 15px; cursor: pointer; outline: none; font-size: 15px; font-weight: 500; background: #4A90E2; margin-top: 7px; color: white; box-shadow: 0 1px 1px 0 rgba(0, 0, 0, 0.5); -webkit-tap-highlight-color: rgba(0, 0, 0, 0); } a.heateor_sss_amp{padding:0 4px;}div.heateor_sss_horizontal_sharing a amp-img{display:inline-block;}.heateor_sss_amp_gab img{background-color:#25CC80}.heateor_sss_amp_parler img{background-color:#892E5E}.heateor_sss_amp_gettr img{background-color:#E50000}.heateor_sss_amp_instagram img{background-color:#624E47}.heateor_sss_amp_yummly img{background-color:#E16120}.heateor_sss_amp_youtube img{background-color:#ff0000}.heateor_sss_amp_teams img{background-color:#5059c9}.heateor_sss_amp_google_translate img{background-color:#528ff5}.heateor_sss_amp_x img{background-color:#2a2a2a}.heateor_sss_amp_rutube img{background-color:#14191f}.heateor_sss_amp_buffer img{background-color:#000}.heateor_sss_amp_delicious img{background-color:#53BEEE}.heateor_sss_amp_rss img{background-color:#e3702d}.heateor_sss_amp_facebook img{background-color:#0765FE}.heateor_sss_amp_digg img{background-color:#006094}.heateor_sss_amp_email img{background-color:#649A3F}.heateor_sss_amp_float_it img{background-color:#53BEEE}.heateor_sss_amp_linkedin img{background-color:#0077B5}.heateor_sss_amp_pinterest img{background-color:#CC2329}.heateor_sss_amp_print img{background-color:#FD6500}.heateor_sss_amp_reddit img{background-color:#FF5700}.heateor_sss_amp_mastodon img{background-color:#6364FF}.heateor_sss_amp_stocktwits img{background-color: #40576F}.heateor_sss_amp_mewe img{background-color:#007da1}.heateor_sss_amp_mix img{background-color:#ff8226}.heateor_sss_amp_tumblr img{background-color:#29435D}.heateor_sss_amp_twitter img{background-color:#55acee}.heateor_sss_amp_vkontakte img{background-color:#0077FF}.heateor_sss_amp_yahoo img{background-color:#8F03CC}.heateor_sss_amp_xing img{background-color:#00797D}.heateor_sss_amp_instagram img{background-color:#527FA4}.heateor_sss_amp_whatsapp img{background-color:#55EB4C}.heateor_sss_amp_aim img{background-color: #10ff00}.heateor_sss_amp_amazon_wish_list img{background-color: #ffe000}.heateor_sss_amp_aol_mail img{background-color: #2A2A2A}.heateor_sss_amp_app_net img{background-color: #5D5D5D}.heateor_sss_amp_balatarin img{background-color: #fff}.heateor_sss_amp_bibsonomy img{background-color: #000}.heateor_sss_amp_bitty_browser img{background-color: #EFEFEF}.heateor_sss_amp_blinklist img{background-color: #3D3C3B}.heateor_sss_amp_blogger_post img{background-color: #FDA352}.heateor_sss_amp_blogmarks img{background-color: #535353}.heateor_sss_amp_bookmarks_fr img{background-color: #E8EAD4}.heateor_sss_amp_box_net img{background-color: #1A74B0}.heateor_sss_amp_buddymarks img{background-color: #ffd400}.heateor_sss_amp_care2_news img{background-color: #6EB43F}.heateor_sss_amp_comment img{background-color: #444}.heateor_sss_amp_diary_ru img{background-color: #E8D8C6}.heateor_sss_amp_diaspora img{background-color: #2E3436}.heateor_sss_amp_dihitt img{background-color: #FF6300}.heateor_sss_amp_diigo img{background-color: #4A8BCA}.heateor_sss_amp_douban img{background-color: #497700}.heateor_sss_amp_draugiem img{background-color: #ffad66}.heateor_sss_amp_evernote img{background-color: #8BE056}.heateor_sss_amp_facebook_messenger img{background-color: #0084FF}.heateor_sss_amp_fark img{background-color: #555}.heateor_sss_amp_fintel img{background-color: #087515}.heateor_sss_amp_flipboard img{background-color: #CC0000}.heateor_sss_amp_folkd img{background-color: #0F70B2}.heateor_sss_amp_google_news img{background-color: #4285F4}.heateor_sss_amp_google_classroom img{background-color: #FFC112}.heateor_sss_amp_google_gmail img{background-color: #E5E5E5}.heateor_sss_amp_hacker_news img{background-color: #F60}.heateor_sss_amp_hatena img{background-color: #00A6DB}.heateor_sss_amp_instapaper img{background-color: #EDEDED}.heateor_sss_amp_jamespot img{background-color: #FF9E2C}.heateor_sss_amp_kakao img{background-color: #FCB700}.heateor_sss_amp_kik img{background-color: #2A2A2A}.heateor_sss_amp_kindle_it img{background-color: #2A2A2A}.heateor_sss_amp_known img{background-color: #fff101}.heateor_sss_amp_line img{background-color: #00C300}.heateor_sss_amp_livejournal img{background-color: #EDEDED}.heateor_sss_amp_mail_ru img{background-color: #356FAC}.heateor_sss_amp_mendeley img{background-color: #A70805}.heateor_sss_amp_meneame img{background-color: #FF7D12}.heateor_sss_amp_mixi img{background-color: #EDEDED}.heateor_sss_amp_myspace img{background-color: #2A2A2A}.heateor_sss_amp_netlog img{background-color: #2A2A2A}.heateor_sss_amp_netvouz img{background-color: #c0ff00}.heateor_sss_amp_newsvine img{background-color: #055D00}.heateor_sss_amp_nujij img{background-color: #D40000}.heateor_sss_amp_odnoklassniki img{background-color: #F2720C}.heateor_sss_amp_oknotizie img{background-color: #fdff88}.heateor_sss_amp_outlook_com img{background-color: #0072C6}.heateor_sss_amp_papaly img{background-color: #3AC0F6}.heateor_sss_amp_pinboard img{background-color: #1341DE}.heateor_sss_amp_plurk img{background-color: #CF682F}.heateor_sss_amp_pocket img{background-color: #ee4056}.heateor_sss_amp_polyvore img{background-color: #2A2A2A}.heateor_sss_amp_printfriendly img{background-color: #61D1D5}.heateor_sss_amp_protopage_bookmarks img{background-color: #413FFF}.heateor_sss_amp_pusha img{background-color: #0072B8}.heateor_sss_amp_qzone img{background-color: #2B82D9}.heateor_sss_amp_refind img{background-color: #1492ef}.heateor_sss_amp_rediff_mypage img{background-color: #D20000}.heateor_sss_amp_renren img{background-color: #005EAC}.heateor_sss_amp_segnalo img{background-color: #fdff88}.heateor_sss_amp_sina_weibo img{background-color: #ff0}.heateor_sss_amp_sitejot img{background-color: #ffc800}.heateor_sss_amp_skype img{background-color: #00AFF0}.heateor_sss_amp_sms img{background-color: #6ebe45}.heateor_sss_amp_slashdot img{background-color: #004242}.heateor_sss_amp_stumpedia img{background-color: #EDEDED}.heateor_sss_amp_svejo img{background-color: #fa7aa3}.heateor_sss_amp_symbaloo_feeds img{background-color: #6DA8F7}.heateor_sss_amp_telegram img{background-color: #3DA5f1}.heateor_sss_amp_trello img{background-color: #1189CE}.heateor_sss_amp_tuenti img{background-color: #0075C9}.heateor_sss_amp_twiddla img{background-color: #EDEDED}.heateor_sss_amp_typepad_post img{background-color: #2A2A2A}.heateor_sss_amp_viadeo img{background-color: #2A2A2A}.heateor_sss_amp_viber img{background-color: #8B628F}.heateor_sss_amp_webnews img{background-color: #CC2512}.heateor_sss_amp_wordpress img{background-color: #464646}.heateor_sss_amp_wykop img{background-color: #367DA9}.heateor_sss_amp_yahoo_mail img{background-color: #400090}.heateor_sss_amp_yahoo_messenger img{background-color: #400090}.heateor_sss_amp_yoolink img{background-color: #A2C538}.heateor_sss_amp_youmob img{background-color: #3B599D}.heateor_sss_amp_gentlereader img{background-color: #46aecf}.heateor_sss_amp_threema img{background-color: #2A2A2A}.heateor_sss_amp_bluesky img{background-color:#0085ff}.heateor_sss_amp_threads img{background-color:#000}.heateor_sss_amp_raindrop img{background-color:#0b7ed0}.heateor_sss_amp_micro_blog img{background-color:#ff8800}.heateor_sss_amp amp-img{border-radius:999px;} .amp-logo amp-img{width:190px} .amp-menu input{display:none;}.amp-menu li.menu-item-has-children ul{display:none;}.amp-menu li{position:relative;display:block;}.amp-menu > li a{display:block;} /* Inline styles */ div.acsse3e3c{font-weight:bold;}amp-img.acss334b9{max-width:35px;}div.acss138d7{clear:both;}div.acssf5b84{--relposth-columns:3;--relposth-columns_m:2;--relposth-columns_t:2;}div.acss43dca{aspect-ratio:1/1;background:transparent url(https://aiofm.net/wp-content/uploads/2024/03/Researchers-create-voluntary-safety-rules-for-AI-protein-design-150x150.jpg) no-repeat scroll 0% 0%;height:150px;max-width:150px;}div.acss6bdea{color:#333333;font-family:Arial;font-size:12px;height:75px;}div.acss90036{aspect-ratio:1/1;background:transparent url(https://aiofm.net/wp-content/uploads/2025/03/Brics-Russia-adopts-the-cryptos-for-exporting-oil-to-India-150x150.png) no-repeat scroll 0% 0%;height:150px;max-width:150px;}div.acss3a225{aspect-ratio:1/1;background:transparent url(https://aiofm.net/wp-content/uploads/2024/02/The-Vulnerabilities-and-Security-Threats-Facing-Large-Language-Models.webp-150x150.webp) no-repeat scroll 0% 0%;height:150px;max-width:150px;}div.acssc0b7d{-webkit-box-shadow:none;box-shadow:none;left:-10px;top:100px;max-width:44px;}amp-img.acss8b671{max-width:40px;} .icon-widgets:before {content: "\e1bd";}.icon-search:before {content: "\e8b6";}.icon-shopping-cart:after {content: "\e8cc";}
In late 2023, a team of third-party researchers discovered a troubling glitch in OpenAI’s widely used artificial intelligence model GPT-3.5.
When asked to repeat certain words a thousand times, the model began repeating the word over and over, then suddenly switched to spitting out incoherent text and snippets of personal information drawn from its training data, including parts of names, phone numbers, and email addresses. The team that discovered the problem worked with OpenAI to ensure the flaw was fixed before revealing it publicly. It is just one of scores of problems found in major AI models in recent years.
In a proposal released today, more than 30 prominent AI researchers, including some who found the GPT-3.5 flaw, say that many other vulnerabilities affecting popular models are reported in problematic ways. They suggest a new scheme supported by AI companies that gives outsiders permission to probe their models and a way to disclose flaws publicly.
“Right now it’s a little bit of the Wild West,” says Shayne Longpre, a PhD candidate at MIT and the lead author of the proposal. Longpre says that some so-called jailbreakers share their methods of breaking AI safeguards the social media platform X, leaving models and users at risk. Other jailbreaks are shared with only one company even though they might affect many. And some flaws, he says, are kept secret because of fear of getting banned or facing prosecution for breaking terms of use. “It is clear that there are chilling effects and uncertainty,” he says.
The security and safety of AI models is hugely important given widely the technology is now being used, and how it may seep into countless applications and services. Powerful models need to be stress-tested, or red-teamed, because they can harbor harmful biases, and because certain inputs can cause them to break free of guardrails and produce unpleasant or dangerous responses. These include encouraging vulnerable users to engage in harmful behavior or helping a bad actor to develop cyber, chemical, or biological weapons. Some experts fear that models could assist cyber criminals or terrorists, and may even turn on humans as they advance.
The authors suggest three main measures to improve the third-party disclosure process: adopting standardized AI flaw reports to streamline the reporting process; for big AI firms to provide infrastructure to third-party researchers disclosing flaws; and for developing a system that allows flaws to be shared between different providers.
The approach is borrowed from the cybersecurity world, where there are legal protections and established norms for outside researchers to disclose bugs.
“AI researchers don’t always know how to disclose a flaw and can’t be certain that their good faith flaw disclosure won’t expose them to legal risk,” says Ilona Cohen, chief legal and policy officer at HackerOne, a company that organizes bug bounties, and a coauthor on the report.
Large AI companies currently conduct extensive safety testing on AI models prior to their release. Some also contract with outside firms to do further probing. “Are there enough people in those [companies] to address all of the issues with general-purpose AI systems, used by hundreds of millions of people in applications we’ve never dreamt?” Longpre asks. Some AI companies have started organizing AI bug bounties. However, Longpre says that independent researchers risk breaking the terms of use if they take it upon themselves to probe powerful AI models.
The adoption of AI has caused an increased need for proper data governance, and companies…
OpenAI on Monday launched a new family of models called GPT-4.1. Yes, “4.1” — as…
Evan Brown serves as the Executive Director of EDGE (Economic Development Growth and Expansion) at…
All of the 400 exposed AI systems found by UpGuard have one thing in common:…
In the 90s, we collected Pokémon cards, in the 2000s, we all had a weird…
Medicaid has become a central point of a heated political battle, as Republican lawmakers push…
This website uses cookies.