Design Doc: Skill Relevancy
A skill can be more or less relevant to a given job and context. For example, the skill walking is relevant to both a Crossing Guard and an Olympic speed walker, but it is far more important to succeeding as an Olympic speed walker. Here importance is the dimension along which relevance is measured: less important skills are less relevant and more important skills are more relevant. Note that O*NET captures, in addition to importance, the notion of level, which is the degree of mastery required. The notion of importance here does not claim to capture level; whether it should is open for discussion.
Under the Skills API, users retrieving a list of skills for a given occupation and/or context (such as geography) need a way of ordering the results. Ordering results helps users quickly determine which collection of skills is most important for their purpose.
Additionally, users will typically want a hint as to why a skill is relevant. A skill may be unique to an occupation, hard to master and therefore highly valued economically, or concentrated in a geographic area. Examples include dental cleaning, which is unique to dental occupations; Intel x64 Assembly, which is particularly hard to master; and elevator repair, which is more concentrated in large metropolitan areas than in rural villages.
Many users want to understand skills in terms of their local geography, so assessing relevance within a geographical context is important as well.
In short, a jumping-off point for the goals of skill relevance includes:
- Rank ordering a list of skills according to importance/criticality/demand
- Hinting at why a skill is important (geography, uniqueness, challenging)
- Imposing a context on skills relevance: geographical region, uniqueness within occupations, challenging/hard-to-learn. Alternative contexts might be skills demanded by employers, relevance to general critical skills (e.g., will learning this skill also help me learn a critical path skill).
This is a rough sketch of possible approaches to meeting the goals:
- Consider the universe of skills, jobs, geography, quarters as a simple undirected graph for now.
- This graph is completely normalized (no private data provider information is directly shared) and shared as an open WDI data asset (?) for researchers and developer partners. This is in contrast to Google's private Jobs ontology.
- The definition of relevance may then be found in any metric, or combination of metrics, from social network analysis.
- For example, as a starting point, the distribution of in/out edges to a job node, compared with that of its job neighborhood, might indicate whether a job is relevant in terms of forward career progression. This can be placed in the context of geography, time, or interest through simple edge and node filtering. Say that, after restricting nodes to given FIPS areas, we find the CEO role in the neighborhood of the Vice President job. Using graph metrics, we find that the CEO role has a low ratio of output edges to input edges, which is a heuristic for a desirable role to transition to within a career. In this case we recommend to Vice Presidents that they consider CEO roles as a next step, and we identify critical skills they lack that are found exclusively among CEOs.
- For skills, given the above, I think we need some kind of associative context because we don't have finer-grained connections between skills. In this case we connect skills to each other through their occupations: if two skills are found in the same occupation, their edge weight is some function of their co-occurrence statistics. We can split this edge into several edges (and meta edges), each capturing quarter and geography. Here geography and occupations induce a hypergraph; occupations are standard edges.
- Implementation: The initial attempt requires an adjacency matrix of jobs, skills, FIPS codes and quarters. Nodes need to be normalized to uuids, etc.
- Implementation: For simplicity we could convert the hypergraph to a bipartite graph. We could also operate on the hypergraph directly. See https://github.com/jinhuang/hyperx and https://www.usenix.org/system/files/conference/hotcloud15/hotcloud15-heintz.pdf for ideas.
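To make the two graph ideas in the list above concrete, here is a minimal sketch using only the Python standard library. It is not an implementation from this project: the sample data, the area codes, and the function names `out_in_ratio` and `skill_cooccurrence` are all assumptions made for illustration, and the co-occurrence weight (a plain count of shared occupations) is just one possible choice for "some function of their co-occurrence statistics".

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical job-to-job transitions: (from_job, to_job, area_code).
# The area codes are placeholder sample values, not real project data.
TRANSITIONS = [
    ("Vice President", "CEO", "17031"),
    ("Director", "Vice President", "17031"),
    ("Vice President", "Director", "17031"),
    ("Director", "CEO", "17031"),
    ("CEO", "Board Member", "36061"),
]

# Hypothetical observations: (occupation, skill, area_code, quarter).
OBSERVATIONS = [
    ("Dental Hygienist", "dental cleaning", "17031", "2016Q1"),
    ("Dental Hygienist", "patient care", "17031", "2016Q1"),
    ("Dentist", "dental cleaning", "17031", "2016Q1"),
    ("Dentist", "patient care", "17031", "2016Q1"),
    ("Dentist", "oral surgery", "17031", "2016Q1"),
    ("Nurse", "patient care", "36061", "2016Q2"),
]


def out_in_ratio(transitions, area=None):
    """Return {job: out-edges / in-edges}, optionally filtered by area."""
    out_deg, in_deg = Counter(), Counter()
    for src, dst, edge_area in transitions:
        if area is not None and edge_area != area:
            continue  # edge filtering by geographic context
        out_deg[src] += 1
        in_deg[dst] += 1
    # Jobs with no incoming edges are skipped to avoid dividing by zero.
    return {job: out_deg[job] / in_deg[job] for job in in_deg}


def skill_cooccurrence(observations, area=None, quarter=None):
    """Weight each skill pair by the number of occupations containing both."""
    skills_by_occupation = defaultdict(set)
    for occupation, skill, obs_area, obs_quarter in observations:
        if area is not None and obs_area != area:
            continue
        if quarter is not None and obs_quarter != quarter:
            continue
        skills_by_occupation[occupation].add(skill)
    weights = Counter()
    for skills in skills_by_occupation.values():
        # Sorting gives each undirected pair a single canonical key.
        for a, b in combinations(sorted(skills), 2):
            weights[(a, b)] += 1
    return weights


ratios = out_in_ratio(TRANSITIONS, area="17031")
weights = skill_cooccurrence(OBSERVATIONS, area="17031", quarter="2016Q1")
```

In this toy data the CEO node has the lowest out/in ratio within the filtered area, which is the signal the heuristic looks for. Passing a different `area` or `quarter` yields a separate set of edge weights, which is one simple way to approximate splitting an edge by context.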
- Geographic relevance: frequency of skill occurrence within a Metropolitan Statistical Area (MSA). We are interested in outliers: very high and very low frequency skill occurrences. See Jobs Design Doc: Skill Relevance for the intersection between Jobs and Skill relevance.
- Occupational relevance: criticality to an occupation, centrality measure of skill clusters
- Challengingness/hard-to-learn: level of education, effort required to obtain minimal competence in skill, education level of those holding this skill
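As one possible reading of the geographic relevance item above, the sketch below compares a skill's share of observations within an MSA to its share across all MSAs, then flags unusually high or low ratios. This ratio is similar in spirit to a location quotient; it is a stand-in chosen for illustration, not a measure this document prescribes. The sample data, the thresholds, and the names `local_share_ratio` and `outliers` are all assumptions.

```python
from collections import Counter

# Hypothetical skill observations: (msa, skill). Not real data.
POSTINGS = [
    ("Metro A", "elevator repair"), ("Metro A", "elevator repair"),
    ("Metro A", "walking"), ("Metro A", "walking"),
    ("Rural B", "walking"), ("Rural B", "walking"),
    ("Rural B", "walking"), ("Rural B", "elevator repair"),
]


def local_share_ratio(postings):
    """Return {(msa, skill): share within the MSA / share across all MSAs}.

    Only (msa, skill) pairs that occur at least once are included.
    """
    total = len(postings)
    overall = Counter(skill for _, skill in postings)
    msa_totals = Counter(msa for msa, _ in postings)
    per_msa = Counter(postings)
    ratios = {}
    for (msa, skill), count in per_msa.items():
        local_share = count / msa_totals[msa]
        overall_share = overall[skill] / total
        ratios[(msa, skill)] = local_share / overall_share
    return ratios


def outliers(ratios, low=0.75, high=1.25):
    """Keep pairs whose ratio is unusually low or high (assumed cutoffs)."""
    return {key: r for key, r in ratios.items() if r <= low or r >= high}


flagged = outliers(local_share_ratio(POSTINGS))
```

With this toy data, elevator repair is flagged as over-represented in Metro A and under-represented in Rural B, while walking is not flagged in either area. Real data would need sensible thresholds and some handling of small counts, which this sketch ignores.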