by Evaristo Caraballo

English, Population, Connectivity and Campsites

Factors driving the use of Free Code Camp worldwide

Free Code Camp offers a coding education that’s open source, free, and accessible. Sounds ideal, right?

Actually, there are several areas where it can improve significantly — especially for people outside of the United States.

I recently analyzed Free Code Camp’s open data and found that worldwide adoption of Free Code Camp is affected by several factors. To be fair, these seem to be the same factors that affect other end-to-end online programs, and online courses (MOOCs) as well. These factors include the wealth of the learner’s home country, its connectivity, population size, and English proficiency, and (although less documented) the existence of active offline communities.

You can tell from reading social media posts by campers — Free Code Camp’s community members — that English proficiency and socialization affect how useful Free Code Camp is to a given camper. This article will explore these, along with less obvious factors.

To get an approximation of how geography affects the usefulness of Free Code Camp, I looked for differences in the numbers of sessions (from Google Analytics) for regions and countries, basing the comparison on relevant demographics.

I started by broadly comparing the absolute number of sessions between sub-continental regions. It only takes a glance at this map to realize that adoption of Free Code Camp in Africa, Central Asia, and the smallest Pacific Islands is far behind the rest of the regions.

This is a sign that having a healthy economy, or belonging to the periphery of one, could be a relevant factor affecting adoption of the program. In fact, country wealth could be strongly related to program adoption, as we can see by exploring some demographics of the Top 20 countries by number of sessions.

The table below also includes some of the wealthiest nations: it contains 16 of the 79 countries with above-average GDP per capita (world GDP per capita in 2015 = Int’l $14,982, per Wikipedia as of January 2016).

Still, having a healthy economy is not enough to explain the table. A closer look suggests that the size of the internet population is a determining factor: the countries in the list represent 60% of the world’s total population but, more importantly, 68% of the world’s internet population.

OK, so far we’ve found some evidence that economic wealth and (internet) population size affect a given person’s likelihood of joining the Free Code Camp community. What are some other factors?

One factor seems to be English proficiency. The other seems to be the presence of campsites — city-based chapters of Free Code Camp where campers meet up and code together.

To find out, we must look at the data in relative terms and get rid of exceptional records in order to unveil their impact.

For this step, I filtered out large countries like the United States, Canada, the United Kingdom, India and China, and compared only countries with complete data.

I also recalculated the number of sessions and the number of campsites relative to internet population size, so that they show a relative trend.
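
One plausible way to express counts relative to internet population size is as a rate per million internet users. The sketch below assumes that scaling; the article does not state the exact formula, and the `Country` class, the `per_million_users` helper and all figures are hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Country:
    name: str
    sessions: int        # absolute number of sessions (made-up figure)
    campsites: int       # number of campsites (made-up figure)
    internet_users: int  # size of the internet population (made-up figure)


def per_million_users(count: int, internet_users: int) -> float:
    """Scale a raw count to a rate per million internet users."""
    if internet_users <= 0:
        raise ValueError("internet_users must be positive")
    return count * 1_000_000 / internet_users


# Illustrative numbers only, not Free Code Camp data.
countries = [
    Country("A", sessions=50_000, campsites=10, internet_users=20_000_000),
    Country("B", sessions=30_000, campsites=12, internet_users=5_000_000),
]

# Map each country to (sessions per million, campsites per million).
relative = {
    c.name: (
        per_million_users(c.sessions, c.internet_users),
        per_million_users(c.campsites, c.internet_users),
    )
    for c in countries
}

for name, (sessions_rate, campsites_rate) in relative.items():
    print(f"{name}: {sessions_rate:.1f} sessions, {campsites_rate:.2f} campsites per million users")
```

In this toy data, country A has more sessions in absolute terms, but country B is more active once both counts are divided by the number of internet users.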

Instead of a simple table I relied on a scatter plot of the modified numbers of campsites and sessions, with a couple of additional features:

  • the size of the point represents the comparable size of the internet population in each country (large if above average, small if below average)

  • the color of the point indicates English proficiency (based on Education First’s English Proficiency Index) for each selected country — purple if above average, green if below average

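
The two encodings above can be sketched as a small classification step. This is a minimal illustration, assuming "average" means the arithmetic mean over the selected countries and that a value exactly equal to the mean counts as below average. The `encode_points` function, the field names and the numbers are hypothetical and are not taken from the original analysis.

```python
from statistics import mean


def encode_points(rows):
    """Assign a size and a color to each country's point.

    rows: list of dicts with 'name', 'internet_users' and 'english_score'.
    """
    avg_users = mean(r["internet_users"] for r in rows)
    avg_english = mean(r["english_score"] for r in rows)
    return {
        r["name"]: {
            # Large point if the internet population is above the mean.
            "size": "large" if r["internet_users"] > avg_users else "small",
            # Purple if English proficiency is above the mean, else green.
            "color": "purple" if r["english_score"] > avg_english else "green",
        }
        for r in rows
    }


# Made-up values for illustration only.
rows = [
    {"name": "A", "internet_users": 40_000_000, "english_score": 65.0},
    {"name": "B", "internet_users": 5_000_000, "english_score": 48.0},
    {"name": "C", "internet_users": 9_000_000, "english_score": 58.0},
]

points = encode_points(rows)
for name, style in points.items():
    print(name, style["size"], style["color"])
```

The resulting size and color labels could then be passed to any plotting library as marker attributes for the scatter plot.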

The chart above reveals several insights simultaneously:

  1. The higher the number of campsites, the higher the relative number of sessions. This effect is particularly glaring among countries with larger internet populations.

  2. Countries with higher English proficiency (purple dots) appear more active in relative terms than countries with lower English proficiency (green dots), regardless of the size of the country’s internet population.

So, a given country’s wealth and internet population size are not the only important factors: English proficiency and an active offline community also seem to affect the diffusion of Free Code Camp in that country.

In summary:

  1. At a high level, richer regions account for most of the sessions; our campers largely come from countries in those regions, such as the US, Canada or several European countries.

  2. However, by comparing countries independently we can affirm that a large internet population is very relevant when we talk about absolute numbers. Some examples of this are India, Brazil, Russia, the Philippines and probably China (we capture only a fraction of sessions there since the Great Firewall blocks Google Analytics for all non-VPN traffic).

  3. Countries with high English proficiency adopt Free Code Camp more widely, although to see this we needed to control for internet population size.

  4. Finally, if you control for internet population size, you can see that the number of campsites seems to be related to the number of sessions, suggesting that coding together in person increases campers’ activity on Free Code Camp’s website.

So what is Free Code Camp doing about all of this?

Even before this analysis, our community had been taking action to reduce linguistic barriers to adoption. A small army of campers has been voluntarily translating Free Code Camp’s open source curriculum, wiki, and other instructional resources into different world languages.

Vladimir Tamara, a core team member in Bogotá, Colombia, has already overseen the curriculum’s translation into Spanish. He’s now coordinating the translation effort for other world languages, and helping write the code that will handle language options.

To reduce the impact of poor connectivity, and to serve the large number of campers who use smartphones as their primary (or only) internet device, Free Code Camp is continually improving the mobile experience. We’re also working on an offline mode for campers who lack stable internet access and electricity.

One interesting trend that emerged from my analysis is the relationship between the number of sessions and the number of campsites in a given country. These in-person groups may serve to attract and involve campers who would otherwise not have the initiative to stick with a challenging program like Free Code Camp.

Justin Richardsson, a visual designer in Toronto, Canada, recently joined our core team to focus on campsites. He has already organized many coding events through the Toronto campsite. His goal is to learn from other campsite leaders and distribute their knowledge to campsites worldwide.

I’m also working on related visualizations at bl.ocks.org/evaristoc.

This analysis just scratches the surface of what we can learn from Free Code Camp’s open data. Join our Data Science chat room and help us make sense of all these data.

Translated from: https://www.freecodecamp.org/news/english-size-connectivity-and-campsites-factors-driving-the-use-of-free-code-camp-worldwide-3c9d4e2b8c17/
