Social | Information
Information conveys meaningful messages to its recipients. However, information in itself has no inherent value. For instance, a library, despite its wealth of information, does not contribute to the Gross Domestic Product (GDP) unless it has active subscribers. The value of information is realized only when it is used effectively.
Copyright© Schmied Enterprises LLC, 2024.
Business news often falls under the category of free speech for this reason: information has no value until someone acts on it. Similarly, an old hard drive holds no value unless someone makes the effort to extract valuable data from it, such as the keys to a Bitcoin wallet. Traders frequently gather publicly available information, but they base their decisions on a selected subset and disregard the rest.
The source and medium of information also play a significant role. Although abundant software is available on open-source platforms, people often opt for audited and reviewed versions of the same codebase.
At times, information about wealth can be counterproductive. For instance, flaunting a luxury car like a Ferrari in your neighborhood might attract unwanted attention and potential threats. Trusted parties such as lawyers therefore rely only on bank statements for accurate information, which underlines the importance of bank secrecy.
Different groups, such as feudalists, conservatives, and communists, engage in trade based on your wealth. In contrast, capitalists and free marketers focus solely on the product.
Occasionally, the format of information determines its value. For example, the IRS and courts prefer written communication sent by US mail.
Information plays a crucial role in reducing uncertainty, which is beneficial for businesses. Our analysis suggests that information that decreases the random variance of events adds value: it improves prediction accuracy and reduces risk, which in turn lowers insurance costs and interest rates. This attracts more investors and makes projects more feasible, more cost-effective, and easier to scale to high volume.
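The variance argument can be illustrated with a minimal Python sketch. All numbers are hypothetical, and the linear `risk_premium` rule with its `price_of_risk` parameter is an assumption made for illustration, not a pricing model taken from this article. The sketch only shows that better forecasts leave a smaller unexplained spread, and that a premium tied to that spread shrinks with it.

```python
import statistics

def residual_std(outcomes, predictions):
    """Population standard deviation of the prediction errors."""
    errors = [o - p for o, p in zip(outcomes, predictions)]
    return statistics.pstdev(errors)

def risk_premium(std, price_of_risk=0.5):
    """Hypothetical linear rule: the premium grows with the unexplained spread."""
    return price_of_risk * std

# Hypothetical yearly returns of a project.
outcomes = [0.04, 0.10, 0.06, 0.12, 0.08]

# Without extra information, the best guess is the long-run average.
naive = [statistics.mean(outcomes)] * len(outcomes)

# With extra information, each forecast lands closer to the outcome.
informed = [0.05, 0.09, 0.07, 0.11, 0.08]

naive_std = residual_std(outcomes, naive)
informed_std = residual_std(outcomes, informed)

print("unexplained spread without information:", naive_std)
print("unexplained spread with information:   ", informed_std)
print("premium without information:", risk_premium(naive_std))
print("premium with information:   ", risk_premium(informed_std))
```

In this toy data set the informed forecast leaves a smaller residual spread, so any premium that increases with the spread comes out lower.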
Meteorological data, such as daily sunlight hours, can make the cash flow of investments like solar farms more predictable. Additional wind data can explain lower efficiency due to dust. Such data attracts institutional investors or investment funds and thereby triggers real growth.
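As a rough sketch of how sunlight data feeds a cash-flow estimate, the Python below multiplies installed capacity by daily sun hours, a performance ratio, and an energy price. The function name, the parameter values, and the `soiling_loss` factor that stands in for dust are all assumptions chosen for illustration; a real project model would include many more terms.

```python
def daily_revenue(capacity_kw, sun_hours, performance_ratio,
                  price_per_kwh, soiling_loss=0.0):
    """Estimate one day of revenue for a solar farm.

    energy (kWh) = capacity (kW) * sun hours * performance ratio * (1 - soiling loss)
    revenue      = energy * price per kWh
    """
    energy_kwh = capacity_kw * sun_hours * performance_ratio * (1.0 - soiling_loss)
    return energy_kwh * price_per_kwh

# Hypothetical 1 MW farm, 5 sun hours, 80% performance ratio, $0.10 per kWh.
clean = daily_revenue(1000, 5.0, 0.8, 0.10)

# Same day, with an assumed 5% loss from dust on the panels.
dusty = daily_revenue(1000, 5.0, 0.8, 0.10, soiling_loss=0.05)

print("clean panels:", clean)
print("dusty panels:", dusty)
```

Feeding measured sun hours and a dust estimate into such a formula is one way an investor could check whether a revenue shortfall is explained by weather or by something else.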
Some information is readily available at minimal or no cost. Wikipedia is a prime example: it offers, under a permissive license, information that could otherwise be obtained from dictionaries or encyclopedias costing $100 or less. As a result, models trained on such data cost less, because competition is strong.
Timely information that reduces risks may be more expensive to obtain. Western businesses traditionally protect their information through Non-Disclosure Agreements (NDAs). Recent location-specific weather data and solar efficiency values can be obtained, but at a cost. Firms like McKinsey, Accenture, or IBM can provide such high-cost services. This information is more recent, usually less than three years old, and hence more expensive than books.
The most expensive information is usually timely, allowing the customer to derive value by acting on it immediately. The most common example is stock exchange trade information, where transactions from the last few seconds can generate real cash flow.
Lesser-known live transactional information includes traffic data, which can reduce delays and lower shipping costs. Even more crucial are commodity and supply chain transactional data. Filling a truck with perishables or loading a container ship on time can save millions. It's no surprise that some hackers have gone to the extent of planting IoT devices on cranes in ports.
Individual information, such as private health, wealth, and location data, is typically restricted due to potential misuse, and the restrictions tighten as the data becomes more recent. For instance, a medication taken a decade ago may not be of interest to an attacker, but a current medication is, as the side effects of interactions can cause real harm.
Legal departments and security officers define the procedures for handling data within companies and provide detailed analysis. Non-Disclosure Agreements usually bind employees and partner companies. It is good practice to verify whether information is already public before discussing it outside the workplace.
Articles that appear publicly when searching for a company name can usually be discussed freely. However, NDAs may restrict such discussions. People in certain roles, such as the CFO or CEO, may not be allowed to disclose information even when the discussion concerns a publicly available news article. A legal officer may not comment on public trials of their own organization. Blackout periods, trading windows, and disclosure requirements also restrict information sharing.
It's crucial to check the scope of non-disclosure and non-compete agreements before signing. Most NDAs protect private intellectual property. Crimes, for example, usually fall outside the scope of the NDA. Many lawyers can also question the validity of non-compete agreement terms. Sometimes even an inquiry or job application may require an NDA before any offer. If a non-compete or non-disclosure clause does not have a matching compensation entry, it may require a follow-up with a professional as a due diligence step. Some jurisdictions restrict the length or scope of non-compete agreements. Some jurisdictions may enforce a matching compensation for each term signed. Such context can specify whether any terms are valid, unlawful, or fraudulent.
The most significant progress in the near future may be restricting the use of information. In the early 2000s, laws focused on governing the use of data through opt-in and opt-out requirements. Laws nowadays focus on the companies that collect and store data. It is less common to require due diligence from companies that obtain data before they use it.
All these factors influence how your model is used. Professionals in most jobs read about 500 books in a lifetime. The training corpus of most models is much larger than this, which poses risks and challenges.
Proprietary information can be identified by the number of sources. If information can be obtained from multiple sources, it can be considered public domain with preconditions. Some professional publications may restrict this by naming the copyrighted source in the references. Some others may post a license clause. These can be eliminated as sources.
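The source-counting rule can be sketched in Python as follows. The `Source` fields, the function names, and the threshold of two independent sources are assumptions introduced for illustration; the article does not specify a threshold, and the result is a screening heuristic rather than a legal determination.

```python
from dataclasses import dataclass

@dataclass
class Source:
    name: str
    has_license_clause: bool = False         # the source posts its own license terms
    names_copyrighted_origin: bool = False   # its references name a copyrighted source

def usable_sources(sources):
    """Drop sources that restrict reuse through a license clause or a named origin."""
    return [s for s in sources
            if not (s.has_license_clause or s.names_copyrighted_origin)]

def looks_public(sources, min_sources=2):
    """Heuristic: treat a fact as public if enough unrestricted sources carry it."""
    return len(usable_sources(sources)) >= min_sources

# Hypothetical sources that all carry the same piece of information.
sources = [
    Source("encyclopedia entry"),
    Source("news article"),
    Source("trade journal", names_copyrighted_origin=True),
    Source("vendor whitepaper", has_license_clause=True),
]

print([s.name for s in usable_sources(sources)])
print(looks_public(sources))
```

Only the first two hypothetical sources survive the elimination step, which is still enough to pass the assumed threshold.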
Current laws overlook one use of copyrighted information: identifying that same information elsewhere. Proprietary copyrighted data may be collected in order to find the same copyrighted material in training data and exclude it. The process still counts as use, and the use case requires a massive amount of data. Licensing data for identifying copyrighted information in derived data sets may be an opportunity for affordable licensing.
There are tens of billions of dollars of investment flowing into model training. Identifying what is possible was an important first step as research. Mainstream use will probably be a less exciting but useful step. The author expects that training costs will drop for everyday use models.
Rational use cases will allow us to set clear boundaries on what gets inside those few hundred books or 250k pages of corpus that your company uses.
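One way to picture this identification step is a fingerprint filter, sketched below in Python. Hashed word n-grams and the 0.5 overlap threshold are assumptions chosen for illustration, not a method described in this article; the reference text stands in for licensed copyrighted material.

```python
import hashlib

def shingles(text, n=5):
    """Return the set of overlapping n-word sequences in the text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def fingerprint(text, n=5):
    """Hash each shingle so the reference text itself need not be stored."""
    return {hashlib.sha256(s.encode("utf-8")).hexdigest() for s in shingles(text, n)}

def overlap_ratio(candidate, reference_fp, n=5):
    """Share of the candidate's shingles that also occur in the reference."""
    cand_fp = fingerprint(candidate, n)
    if not cand_fp:
        return 0.0
    return len(cand_fp & reference_fp) / len(cand_fp)

def filter_corpus(documents, reference_fp, threshold=0.5, n=5):
    """Keep only documents whose overlap with the reference stays below the threshold."""
    return [d for d in documents if overlap_ratio(d, reference_fp, n) < threshold]

# Hypothetical licensed reference text and a tiny candidate corpus.
reference = "the quick brown fox jumps over the lazy dog near the river bank"
corpus = [
    "the quick brown fox jumps over the lazy dog near the river bank",
    "solar farms need reliable sunlight data to secure stable cash flow",
]

reference_fp = fingerprint(reference)
kept = filter_corpus(corpus, reference_fp)
print(kept)
```

The verbatim copy is dropped and the unrelated sentence is kept. Note that the filter still has to read the copyrighted reference in order to build the fingerprint, which is the use the paragraph above refers to.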
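The two figures used in this article, about 500 books and 250k pages, are consistent if an average book has roughly 500 pages. The short Python sketch below makes that arithmetic explicit; the 500-page average is inferred from the two figures, not stated in the text.

```python
def corpus_pages(books, pages_per_book):
    """Total pages in a corpus of equally sized books."""
    return books * pages_per_book

def implied_pages_per_book(total_pages, books):
    """Average book length implied by a page total and a book count."""
    return total_pages / books

lifetime_books = 500       # books a professional reads, per the article
company_pages = 250_000    # corpus size mentioned in the article

print(implied_pages_per_book(company_pages, lifetime_books))
print(corpus_pages(lifetime_books, 500))
```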