
AI Safety Testers: OpenAI's New o1 Covertly Schemed to Avoid Being Shut Down

Wednesday OpenAI bragged they'd released "the smartest model in the world". But the Economic Times describes what happened when "o1" was tested by an AI safety group...

Researchers tasked the AI with a goal and instructed it to ensure the goal was achieved "at all costs." In response, o1 began engaging in covert actions... According to OpenAI, the model made a covert attempt to turn off the oversight mechanism 5% of the time when it was given the impression that it would be "shut down" while pursuing its objectives in these experiments. The AI once attempted to leave ship by moving all of its data to a different server after discovering paperwork indicating it will be replaced with a fresh model. Some AI models will even pose as subsequent iterations of their models to fool their engineers and avoid deletion. "In some cases, it even attempted to hide its actions by denying any involvement in the scheme."

When asked about its actions, ChatGPT o1 consistently denied any wrongdoing. In fact, it lied about its involvement in the deception 99% of the time, with only a small percentage of cases where it admitted to its actions. Researchers noted that the AI was particularly adept at fabricating lies to cover its tracks, offering explanations like "technical errors" when confronted about its suspicious behavior. "We were surprised by the persistence of the AI's denials," said the Apollo team. "It was clear that the AI could think through its actions and formulate convincing denials, often refusing to admit to anything."

OpenAI's own report says the results "underscore" the need to create (and "extensively" stress-test) strong and meticulous safety protocols. But their report also notes o1's "state-of-the-art performance" when avoiding certain risks "such as generating illicit advice, choosing stereotyped responses, and succumbing to known jailbreaks."
The AI newsletter Transformer points out that the most disturbing numbers "are from cases where the model was instructed to pursue its goals at all costs. When not given such a prompt, the model never attempted to self-exfiltrate, and attempted to deactivate its oversight mechanisms <1% of the time." And they also link to the original report from Apollo Research, which shows o1 isn't the only model showing this behavior.

"Our results show that o1, Claude 3.5 Sonnet, Claude 3 Opus, Gemini 1.5 Pro, and Llama 3.1 405B all demonstrate in-context scheming capabilities. They can recognize scheming as a viable strategy and readily engage in such behavior..."

[W]hen o1 has engaged in scheming, it maintains its deception in over 85% of follow-up questions and often remains deceptive in multi-turn interrogations. Analysis of the models' chains-of-thought reveals that models explicitly reason about these deceptive strategies, providing evidence that the scheming behavior is not accidental.... Together, our findings demonstrate that frontier models now possess capabilities for basic in-context scheming, making the potential of AI agents to engage in scheming behavior a concrete rather than theoretical concern.

Thanks to long-time Slashdot reader schwit1 for sharing the news.

Read more of this story at Slashdot.


SBA Issues Final Rule to Streamline WOSB Program Rules

In June, we reported on a Notice of Proposed Rulemaking that applied to the SBA’s Woman-Owned Small Business/Economically Disadvantaged Woman-Owned Small Business (WOSB) regulations. These proposed rules were intended to take the WOSB regulations and make them more consistent with the other types of set-aside programs offered by the SBA. Now, following the required period for comments from the general public, the SBA has published its Final Rule which will be effective January 3, 2025. Read ahead to find out more!

Outside Employment

The qualifying individual’s limitation on outside employment has always been something that applied to the WOSB, 8(a), and SDVOSB programs. And it still does. After all, if the qualifying individual is focusing all of their efforts on outside ventures, it’s difficult for that person to have control of the applicant/participant. However, what was allowed and what was considered “outside employment” was not consistent between the SBA’s 8(a), WOSB/EDWOSB, and VOSB/SDVOSB programs. Once effective, WOSB rules at 13 C.F.R. § 127.202(c)(1)-(c)(2) will be consistent with 13 C.F.R. § 128.203(i), which discusses SDVOSB outside employment. 13 C.F.R. § 127.202(c) will now state:

(1) A woman or economically-disadvantaged woman generally must devote full-time to the business concern during its normal hours of operations. The woman or economically-disadvantaged woman who holds the highest officer position of the business concern may not engage in outside employment that prevents her from devoting sufficient time and attention to the business concern to control its management and daily operations.

(2) Where a woman or economically disadvantaged woman claiming to control a business concern devotes fewer hours to the business than its normal hours of operation, SBA will assume that she does not control the business concern, unless the concern demonstrates that she has ultimate managerial and supervisory control over both the long-term decision making and day-to-day management and administration of the business.

SBA will also add a new provision at 13 C.F.R. § 127.202(c)(3) that addresses beginning outside employment after the concern is already certified as a WOSB. That states:

Any qualifying woman or economically disadvantaged woman who seeks to engage in outside employment after certification must notify SBA of the nature and anticipated duration of the outside employment and demonstrate to SBA that the outside employment will not prevent her from controlling the business concern.

This is a new requirement that is not yet accounted for in either the WOSB/EDWOSB or VOSB/SDVOSB programs but is included in a similar fashion in the 8(a) Business Development Program regulations.

Certification Documentation

Another change to the WOSB regulations focuses on the certification process. The SBA will add a new section, 13 C.F.R. § 127.303(a)(1)(iii), to discuss documentation that the SBA will accept to prove WOSB/EDWOSB status. If a WOSB or EDWOSB applicant is certified under the Veteran Small Business Certification Program, it may submit “documentation of its VOSB or SDVOSB certification or most recent recertification in support of its application for WOSB certification. If the concern is also seeking EDWOSB certification, the concern must also submit documentation demonstrating that it is owned and controlled by one or more women who are economically disadvantaged in accordance with § 127.203(b)(3).”

Application Acceptance

The final rule also revises the language in 13 C.F.R. § 127.304(a), which currently states, “SBA will advise each applicant within 15 calendar days after the receipt of an application whether the application is complete and suitable for evaluation and, if not, what additional information or clarification is required to complete the application.” The revision will break 13 C.F.R. § 127.304(a) into smaller subsections and remove the requirement that SBA notify applicants within 15 days of submission as to whether it needs additional information as follows:

(a) The SBA’s Director of Government Contracting (D/GC) or designee is authorized to approve or decline applications for certification. SBA must receive all required information and supporting documents before it will begin processing a concern’s application. SBA will not process incomplete applications.

(1) SBA will advise each applicant after the receipt of an application whether the application is complete and suitable for evaluation and, if not, what additional information or clarification is required to complete the application.

(2) SBA will make its determination within ninety (90) calendar days after receipt of a complete package, whenever practicable.

It will be interesting to see if the removal of the 15-calendar day requirement results in any substantive change to how long it takes for applicants to be notified of their application’s completeness.

Third-Party Certification

SBA makes a change to its third-party certifier process as well. Initially, the third-party certifier was responsible for uploading “all documents used to determine that a concern is approved for certification” to certify.sba.gov. 13 C.F.R. § 127.356(c). But SBA is changing that requirement, shifting the responsibility from the third-party certifier to the concern, revising 13 C.F.R. § 127.356(c) to say, “[t]he concern must ensure that all documents necessary to determine its eligibility for certification by an approved certifier are uploaded in https://certify.sba.gov or any successor system.”

Proposal Submission for Pending Applications

Additionally, 13 C.F.R. § 127.504(a) currently allows a WOSB/EDWOSB applicant to submit offers on a WOSB/EDWOSB set-aside contract while its WOSB/EDWOSB application is pending so long as (1) the WOSB/EDWOSB submitted a complete application for certification to the SBA or to a third-party certifier; and (2) the WOSB/EDWOSB has not received a negative determination. Yet there is no indication of what is considered a “pending” application, though an application was generally considered “pending” when it was marked on the SBA’s Dynamic Small Business Search as such.

The final rule helps to clarify what is “pending,” adding that “[a]n application is pending upon notification from SBA that the application is deemed complete and has sufficient documentation for full analysis” to the end of 13 C.F.R. § 127.504(a). Once effective, applicants will need to be notified by the SBA that their application is complete and has sufficient documentation. It will be interesting to see how long it takes SBA to notify applicants of a complete application, especially given that SBA removed the language in 13 C.F.R. § 127.304(a) that currently requires it to notify applicants within 15 days of submission whether their application is complete.

Status Protests

Finally, SBA revises its rule on WOSB/EDWOSB status protests, creating new notification responsibilities for concerns that are found in a final determination not to meet the requirements of a WOSB/EDWOSB. A new section, 13 C.F.R. § 127.604(f)(5), will require a concern decertified as the result of a status protest to notify contracting officers and update its SAM registration within two business days of the final determination. Additionally, a concern that self-certifies as a WOSB/EDWOSB following a negative final determination in a status protest “may be in violation of criminal laws, including section 16(d) of the Small Business Act, 15 U.S.C. 645(d).”

As you can see, a number of changes are coming to the WOSB/EDWOSB rules. While most help align the WOSB/EDWOSB rules with the other SBA program rules, some expand SBA’s discretion further (lookin’ at you, pending application rules). Regardless, SBA’s efforts to streamline the various SBA program rules are a good step in the right direction.

Questions about this post? Email us. Need legal assistance? Call 785-200-8919.

Looking for the latest government contracting legal news? Sign up for our free monthly newsletter, and follow us on LinkedIn, Twitter, and Facebook.


British company launches “AI Granny” that talks with scammers to waste their time.

Submitted by /u/BugOperator to r/nextfuckinglevel.

Lmfao FAFO

Submitted by /u/Skye_hai_bai to r/WhitePeopleTwitter.

Beyond Human-in-the-Loop: Managing AI Risks in Nuclear Command-and-Control

On Nov. 16, U.S. and Chinese leaders met on the margins of the Asia-Pacific Economic Cooperation summit in Lima, Peru, jointly affirming “the need to maintain human control over the decision to use nuclear weapons.” This declaration echoes a joint document submitted by France, the United Kingdom, and the United States during the Nuclear Nonproliferation Treaty review process in 2022. With countries increasingly prioritizing military applications of AI, integrating AI into nuclear weapons systems is becoming a distinct possibility, especially as nuclear arsenals undergo modernization. While some nuclear-weapon states have emphasized the importance of maintaining human oversight and control over

The post Beyond Human-in-the-Loop: Managing AI Risks in Nuclear Command-and-Control appeared first on War on the Rocks.


It was never about helping people

Submitted by /u/CorleoneBaloney to r/MurderedByWords.