Github; Contributors: @k3nn.eth#0270, @Stanley#8720, @APersonofNote#3896, @cephalopod#8465, @patman needs no robin#0371

<aside> 💡 L.I.O.N: language informatics on organizational networks

</aside>

Leo

Overview

The utility of network analysis has never been clearer than in web3. The Ethereum blockchain alone has a daily volume of over a million transactions, all of which are public record, allowing us to draw edges connecting an ever-expanding network of nodes.
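To make the idea of drawing edges from transactions concrete, here is a minimal sketch using only the Python standard library. The transaction tuples and the helper names `build_edges` and `out_degree` are hypothetical and chosen for illustration; real on-chain data would need to be fetched and cleaned first.

```python
from collections import Counter

# Hypothetical transaction records: (sender, receiver, value in ETH).
transactions = [
    ("0xA", "0xB", 1.5),
    ("0xA", "0xB", 0.5),
    ("0xB", "0xC", 2.0),
    ("0xC", "0xA", 0.1),
]


def build_edges(txs):
    """Aggregate transactions into weighted directed edges."""
    counts = Counter()  # number of transactions per (sender, receiver) edge
    volume = Counter()  # total value transferred per edge
    for sender, receiver, value in txs:
        counts[(sender, receiver)] += 1
        volume[(sender, receiver)] += value
    return counts, volume


def out_degree(counts):
    """Count the distinct addresses each sender has transacted with."""
    degree = Counter()
    for sender, _receiver in counts:
        degree[sender] += 1
    return degree


counts, volume = build_edges(transactions)
degree = out_degree(counts)
```

Each key in `counts` is one directed edge, so repeated transfers between the same pair of addresses become edge weight rather than new edges.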

There is much to learn about the nature of web3 through this lens. For DAOs specifically, transaction data can provide insight into on-chain governance operations and degrees of decentralization.

However, many DAO operations occur off-chain. When it comes to generating insights about community-run, internet-native organizations, there is a particular piece of information that is both deeply insightful and stored off-chain: communication.

Organizational Network Analysis [ONA] looks at both the network and language layers of an organization. Natural language processing [NLP] tools can be used to extract psychometric properties like engagement, turnover intent, and cultural fit by analyzing the language layer within each edge of the network. [1]
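As a rough illustration of attaching a language layer to each edge, the sketch below groups messages by (author, recipient) pair and computes a simple word-rate feature per edge. The message tuples, the `POSITIVE_WORDS` lexicon, and the `positive_rate` feature are all assumptions made for this example; they are not a validated psychometric instrument.

```python
import re
from collections import defaultdict

# Hypothetical messages: (author, replied_to, text).
messages = [
    ("alice", "bob", "Love this proposal, happy to help ship it!"),
    ("bob", "alice", "Thanks. I will draft the spec."),
    ("alice", "bob", "Great, excited to review."),
]

# Toy lexicon, an assumption for illustration only.
POSITIVE_WORDS = {"love", "happy", "great", "excited", "thanks"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def edge_language_features(msgs):
    """Pool the tokens on each directed edge and compute simple features."""
    tokens_by_edge = defaultdict(list)
    for author, target, text in msgs:
        tokens_by_edge[(author, target)].extend(tokenize(text))

    features = {}
    for edge, tokens in tokens_by_edge.items():
        hits = sum(1 for t in tokens if t in POSITIVE_WORDS)
        features[edge] = {
            "n_tokens": len(tokens),
            # Guard against edges whose messages contain no tokens.
            "positive_rate": hits / len(tokens) if tokens else 0.0,
        }
    return features


features = edge_language_features(messages)
```

The result is one feature dictionary per directed edge, which could then be joined onto whatever graph structure holds the network layer.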

Project Lion is an experiment in taking this idea further. Using DAO communication data, state-of-the-art NLP tools, and novel methodology, we envision a psychometric system that moves beyond topics, lexicons, and word frequencies toward adaptive, community-trained intelligent agents that simulate the communication patterns of complex human systems.

Project Details

Classical NLP tools [e.g., NLTK, spaCy, Gensim, Empath] largely dominate ONA work. Project Lion intends to explore the next generation of NLP tools and extend well beyond these capabilities.

We believe that fine-tuning transformer models like GPT-2, GPT-3 [2], and GPT-J on Discord communication data can reveal a new layer of psychometric insight in the aggregate voice of a community.
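Before any fine-tuning, the Discord messages have to be turned into a training corpus. The sketch below shows one possible preparation step, not the project's actual pipeline: the record fields (`channel`, `author`, `content`) and the `build_corpus` helper are assumptions, while `<|endoftext|>` is the end-of-text token used by the GPT-2 tokenizer.

```python
# Hypothetical Discord export records; the field names are assumptions.
messages = [
    {"channel": "general", "author": "alice", "content": "gm everyone"},
    {"channel": "general", "author": "bob", "content": "   "},
    {"channel": "governance", "author": "carol", "content": "Vote closes\nFriday."},
]

EOS = "<|endoftext|>"  # end-of-text token used by the GPT-2 tokenizer


def build_corpus(msgs, min_chars=1):
    """Join cleaned messages into one string, each terminated by EOS."""
    docs = []
    for m in msgs:
        # Collapse runs of whitespace and newlines into single spaces.
        text = " ".join(m["content"].split())
        if len(text) >= min_chars:  # drop empty or whitespace-only messages
            docs.append(text)
    return EOS.join(docs) + EOS if docs else ""


corpus = build_corpus(messages)
```

Author names are dropped on purpose in this sketch, so that the resulting text reflects the pooled voice of the group rather than any single member.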

We refer to this concept as the digital twin: a digital copy of a group's language patterns summed up into a single AI voice.

We hypothesize that:

If these hypotheses can be validated, Project Lion may change the way large-group psychometrics are measured in practice. Today, the most common approach is to use organization-wide surveys to understand the psychometric properties of large groups. If fine-tuned GPT models are sufficiently reflective of the aggregate voice of a community as hypothesized, we may be able to deploy surveys to AI agents rather than humans while maintaining relative accuracy.
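A minimal sketch of what deploying a survey to an AI agent could look like is given below. Everything in it is hypothetical: the Likert items, the prompt wording, the `run_survey` helper, and especially `stub_generate`, which stands in for a fine-tuned model and simply returns a fixed answer so the example runs without any model.

```python
import re
from statistics import mean

# Hypothetical Likert items (1 = strongly disagree, 5 = strongly agree).
ITEMS = [
    "I feel engaged with this community.",
    "I plan to stay active here over the next six months.",
]


def build_prompt(item):
    """Wrap a survey item in a simple rating prompt."""
    return f'On a scale of 1 to 5, how much do you agree: "{item}"\nAnswer:'


def parse_likert(completion):
    """Return the first standalone digit 1-5 in the completion, or None."""
    match = re.search(r"\b[1-5]\b", completion)
    return int(match.group()) if match else None


def run_survey(generate, items, samples=3):
    """Ask each item several times and average the parseable answers."""
    results = {}
    for item in items:
        scores = [parse_likert(generate(build_prompt(item))) for _ in range(samples)]
        scores = [s for s in scores if s is not None]
        results[item] = mean(scores) if scores else None
    return results


def stub_generate(prompt):
    """Stand-in for a fine-tuned model; always answers 4."""
    return " 4 (agree)"


results = run_survey(stub_generate, ITEMS)
```

Sampling each item several times is a design choice in this sketch: a generative model's answers vary between runs, so the mean over samples is treated as the agent's response.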