Teaching metrics and contributor docs at Flock 2017

The Fedora Community Operations (CommOps) team held an interactive workshop during the annual Fedora contributor conference, Flock. Flock took place from August 29th to September 1st in Cape Cod, Massachusetts. Justin W. Flory and Sachin Kamath represented the team in the workshop. CommOps spends a lot of time working with metrics and data tools available in Fedora, like fedmsg and datagrepper. Our workshop introduced some of the tools to work with metrics in Fedora and how to use them. With our leftover time, we discussed the role of contributor-focused documentation in the wiki and moving it to a more static place in Fedora documentation.

What does CommOps do?

The beginning of the session introduced the CommOps team and explained our function in the Fedora community. There are two different skill areas in the CommOps team: one focuses on data analysis and the other focuses on non-technical community work. We also explained the motivation for CommOps. The team’s mission is to bring more heat and light into the project, where light is exposure and awareness, and heat is more activity and contributions. Our work usually follows this mission in both technical and non-technical tasks.

At the beginning of the workshop, metrics were the main discussion point. CommOps helps generate metrics and statistical reports about activity in Fedora. We wanted to talk more about the technical tools we use and how others in the workshop could use them for their own projects in Fedora.

What are Fedora metrics?

fedmsg is the foundation for all metrics in Fedora. fedmsg is a message bus that connects the different applications in Fedora together. All applications and tools used by Fedora contributors emit messages into fedmsg. This includes git commits, Koji build status, Ansible playbook runs, adding a member to a FAS group, and more. Taken all together, the raw data is noisy and difficult to understand. In the #fedora-fedmsg channel on Freenode, you can see all the fedmsg activity in the project (you can see the project “living”!). The value comes when you take the data and filter it down into something meaningful.
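To make the idea of filtering concrete, here is a minimal sketch in Python. The sample messages are made up for illustration and carry only a topic and a simplified payload; real fedmsg messages include more fields. The helper name `filter_by_topic` is hypothetical, not part of fedmsg.

```python
# Hypothetical sample data: real fedmsg messages carry more fields
# (timestamps, message IDs, richer payloads) than shown here.
SAMPLE_MESSAGES = [
    {"topic": "org.fedoraproject.prod.git.receive", "msg": {"note": "commit pushed"}},
    {"topic": "org.fedoraproject.prod.buildsys.build.state.change", "msg": {"note": "build changed state"}},
    {"topic": "org.fedoraproject.prod.git.receive", "msg": {"note": "commit pushed"}},
    {"topic": "org.fedoraproject.prod.fas.group.member.sponsor", "msg": {"note": "member sponsored"}},
]


def filter_by_topic(messages, fragment):
    """Keep only the messages whose topic contains the given fragment."""
    return [m for m in messages if fragment in m["topic"]]


if __name__ == "__main__":
    git_pushes = filter_by_topic(SAMPLE_MESSAGES, ".git.receive")
    print(len(git_pushes), "of", len(SAMPLE_MESSAGES), "sample messages are git pushes")
```

Even a filter this small turns a stream of mixed events into a single question you can answer, such as how often code was pushed.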

One of the examples from the workshop was the analysis of FOSDEM and community engagement by CommOps contributor Bee Padalkar. In her report, she determined our approximate impact in the community at FOSDEM. Using Fedora Badges data, the report revealed how many people we interacted with at FOSDEM and how they engaged with the Fedora community before and after the conference.

The metrics tools in Fedora help make this research possible. One of the primary goals of our workshop was to introduce the metrics tools to the audience and show how to use them. We hoped to empower people to build and generate metrics of their own. We also talked about some of the team’s plans to advance the use of metrics further.

Introducing the CommOps toolbox

The CommOps toolbox is a valuable resource for the data side of CommOps. Our virtual toolbox is a list of all the metrics and data tools available for use and a short description of how they’re used. You can see the toolbox on the wiki.

Sachin led this part of the workshop and explained some of the most common tools. He introduced what a fedmsg publication looked like and helped explain the structure of the data. Next, he introduced Datagrepper. Datagrepper helps you pull fedmsg data based on a set of filters. With your own filters, you can customize the data you see to make comparisons easier. Complex queries with Datagrepper are powerful and help bring insights into various parts of the project. When used effectively, it provides insight into potential weak spots in a Fedora-related project.
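As a rough illustration of what "a set of filters" means in practice, the sketch below builds a Datagrepper query URL. It assumes the public instance at apps.fedoraproject.org with a `/raw` endpoint that accepts `user`, `category`, `delta` (seconds back from now), and `rows_per_page` query parameters; check the Datagrepper documentation before relying on any of these. The function name and the username `exampleuser` are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public Datagrepper instance.
DATAGREPPER_RAW = "https://apps.fedoraproject.org/datagrepper/raw"


def build_query_url(user=None, category=None, delta_seconds=None, rows_per_page=25):
    """Build a Datagrepper query URL from optional filters.

    The parameter names (user, category, delta, rows_per_page) are
    assumptions about the Datagrepper query API, not verified here.
    """
    params = []
    if user:
        params.append(("user", user))
    if category:
        params.append(("category", category))
    if delta_seconds:
        params.append(("delta", delta_seconds))
    params.append(("rows_per_page", rows_per_page))
    return DATAGREPPER_RAW + "?" + urlencode(params)


if __name__ == "__main__":
    # Git activity for a hypothetical user over the last seven days.
    print(build_query_url(user="exampleuser", category="git",
                          delta_seconds=7 * 24 * 3600))
```

Building the URL separately from fetching it makes it easy to compare two queries side by side, for example the same category for two different time windows.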

Finally, Sachin also introduced his Google Summer of Code (GSoC) 2016 project, gsoc-stats. gsoc-stats is a special set of pre-defined filters to create contribution profiles for individual contributors. It breaks down where a contributor spends most of their time in the project and what type of work they do. Part of its use was for GSoC student activity measurements, but it has other uses as well.
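The idea of a contribution profile can be sketched in a few lines. This is not how gsoc-stats is implemented; it is a simplified illustration that assumes fedmsg topics follow the layout `org.fedoraproject.<env>.<category>.<...>` and treats the fourth dotted component as the area of work. The function name `contribution_profile` is hypothetical.

```python
from collections import Counter


def contribution_profile(topics):
    """Return {category: percent of messages} for a list of fedmsg topics."""
    categories = []
    for topic in topics:
        parts = topic.split(".")
        # Assumed layout: org.fedoraproject.<env>.<category>.<...>
        categories.append(parts[3] if len(parts) > 3 else "unknown")
    counts = Counter(categories)
    total = sum(counts.values())
    # Largest share first; an empty input yields an empty profile.
    return {cat: round(100 * n / total, 1) for cat, n in counts.most_common()}


if __name__ == "__main__":
    # Hypothetical topics for one contributor.
    topics = [
        "org.fedoraproject.prod.git.receive",
        "org.fedoraproject.prod.git.receive",
        "org.fedoraproject.prod.git.receive",
        "org.fedoraproject.prod.buildsys.build.state.change",
    ]
    for category, percent in contribution_profile(topics).items():
        print(f"{category}: {percent}%")
```

A breakdown like this shows at a glance where a contributor is most active, which is the kind of question the pre-defined filters are meant to answer.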

What is Grimoire Lab?

Sachin is leading progress on a new tool for CommOps called Grimoire Lab. Grimoire Lab is a visual dashboard tool that lets a user create charts, graphs, and visual measurements from a common data source. The vision for Grimoire Lab in Fedora is to build an interactive dashboard based on fedmsg data. Using the data, anyone could create different gauges and measurements in an easy-to-understand chart or graph. This helps make the fedmsg data more accessible for others in the project to use, without making them write their own code to create graphic measurements.

Most of the Grimoire Lab time in the workshop went to explaining its purpose and expected use. Sachin explained some of the progress made so far to make the tool available in Fedora. The next goal is to get it hosted inside of Fedora’s infrastructure. We hope to deliver an early preview of this over the next year.

Changing the way we write contributor documentation

The end of our workshop focused on non-technical tasks. We had a few tickets highlighted but let audience interest direct the discussion. One of the attendees, Brian Exelbierd, started a discussion about the Fedora Documentation team and some of the changes they’ve made over the last year. Brian introduced AsciiDoc and broke down the workflow that the Docs team uses with the new tooling. After his explanation, the idea came up of hosting contributor-focused information in a Fedora Docs-style project instead of the wiki.

The two strong benefits of this approach are keeping valuable information updated and making it easily accessible. Some common wiki pages for the CommOps team came up, like the pages explaining how to join the team and how to get “bootstrapped” in Fedora. After Brian’s explanation of the tools, the new Docs tool chain felt easy to keep up and effective for promoting high-value contributor content out of the wiki. Later during Flock, on Thursday evening, Brian organized a mini workshop to extend this idea further and teach attendees how to port content over.

CommOps hopes to be an early example of a team using this style of documentation for our contributor-focused content. Once we are comfortable with the set-up and have something to show to others, we want to document how we did it and explain how other teams can do it too. We hope to carry this out over the Fedora 27 release cycle.

See you next year!

Flock 2017 was a conference full of energy and excitement. The three-hour workshop was useful and effective for CommOps team members to meet and work out plans for the next few release cycles in the same room. In addition to our own workshop, spending time in other workshops was also valuable for our team members to see what others in Fedora are doing and where they need help.

A special thanks goes out to all the organizing staff, for their work both during the bid process and during the conference. Your hard work helps drive our community forward every year by making it feel more like a community of people, in an open source world where we mostly interact and work together over text messaging clients and emails.

We hope to see you next year to show you what we have accomplished since the last Flock!

Categories: CommOps, Events

8 Comments

  1. The architecture of the project is quite strange, specific and unlikely to benefit from community/network effects.

    It uses a custom Kibana fork (not even bothering to change the naming in documentation), when ELK already indicated it was little interested in Kibana use in other projects, causing the creation of Grafana, which is a vibrant fork used by many apps with a dynamic community.

    It invents custom storage and collection agents. Please, this is just metrology, it does not need a specific engine, choose one of the established FLOSS solutions on the market (prometheus or influxdb if you can live with open core) and contribute data extracters for the tools you want to check. You’ll benefit from all the work already done in those engines (storage, processing, data derivation, api standardisation), you’ll benefit from the tools which are being created around those products, and they’ll benefit from new data sources.

    Together we are strong, alone it is very difficult to attain (and maintain) the critical size needed for long-term success. That’s what SHARE means.

    • Hi Nim, thanks for your feedback. It’s hard to tell which part of the article you’re referring to. I assume Grimoire Lab? This is the first time I’m hearing this feedback. Do you have any links to published articles where we could read more about this? I’m open-minded to discussion and we haven’t set up Grimoire yet. Personally, I have some experience with Grafana and I enjoyed using it in the past. It would be helpful if you could link to some articles or other content to help the team make a final decision. Thanks!

      • Yes, this is mainly about Grimoire Labs. I’ve been working on a metrology project these past weeks, so I found myself doing a cursory evaluation of their stuff when I followed the link to see what they were about.

        http://grimoirelab.github.io/

        How it works …

        Lots of vanity ‘clever’ component naming, which is most annoying for users, but that they will overlook if there is a solid codebase that is worth the pain of talking in puns. Among those

        KiBiter
        Custom fork of Kibana
        Oh, a new Kibana fork I was not aware of, is it interesting

        https://github.com/GrimoireLab/kibiter

        starts with a complete merge of Kibana 6.0.0-alpha2 including prominent Kibana as project title.

        So this is just copylibing on a giant scale, with enough differences one can’t use ELK directly, not enough to emancipate oneself, no hope of merging back to the root project in the future, and no support by the root project.

        Sooner or later their chosen upstream will make a change that makes rebasing too expensive (because it does not care about them), and they’ll be left with two unpalatable choices: actually fork a codebase that’s too big for their resources, or rewrite a big part of their addon layer. Both options will leave the people that depended on their product in the dust.

        • Thanks for the feedback. We’ll take this into consideration before moving forward with selecting a dashboard option.

        • Hi again, Nim. Thanks for your comments on the problems of following Kibana. I share most of them. Tracking Kibana is not always easy. We know, we’ve been doing that since 4.x. However, I also have to say that they (up to now) have not been the most difficult project in the world to track. Changes in the code base are to some extent predictable, and soft. But yes, as you say, this can change at any point, and of course it is a risk.

          As I commented in my previous answer, you are free to use vanilla Kibana if you prefer: everything will work out of the box, except for some minor details that are relevant only in some scenarios (such as the menu on the left to navigate the dashboards). If you have a look at, for example

          https://symphonyoss.biterg.io/

          You’ll notice how it is almost vanilla Kibana, and the look & feel if you decide to use Kibana instead would be almost the same.

          In fact, the moment Kibana has all the features we need (because they implemented, or because they accepted our pull requests), very likely Kibiter won’t be needed any longer.

          If you’re interested in GrimoireLab, and in analytics on software development data, have a look at the other components. Please let me know if I can help somehow.

          And I’m really sorry that reading our documents about GrimoireLab led you to focus on Kibiter, instead of the rest of the system. Clearly, we have *a lot* of room to improve those documents 🙁

    • Thanks, Nim, for your comments on GrimoireLab. I’m one of the people working on it. Let me contribute with some clarifications:

      * GrimoireLab is intended to retrieve, analyze and visualize information about software development. It does not intend to be a general toolset for ETL, nor for visualization, nor something else.
      * That’s why we use a part of the ELK (ElasticSearch and Kibana) stack: we don’t intend to reinvent the wheel. ElasticSearch suits our needs for continuous nonSQL data storage. Kibana has almost everything we need to visualize data in an actionable dashboard.
      * GrimoireLab works perfectly well with vanilla Kibana. If you want, just go that way. We did a soft-fork of Kibana (Kibiter) just because we missed some functionality. We’re contributing it upstream as much as we can. We’re also building (and contributing) new visualizations to it, which hopefully will be useful for the whole Kibana community.
      * The main components of GrimoireLab are Perceval, Arthur, GrimoireELK, SortingHat. You can have a look at the architecture at:

      https://grimoirelab.gitbooks.io/training/grimoirelab/intro/components.html

      with scenarios and use cases at:

      https://grimoirelab.gitbooks.io/training/grimoirelab/intro/scenarios.html

      * Everything (except for Kibiter) is Python3, and ready to use via pypi packages:

      https://grimoirelab.gitbooks.io/training/before-you-start/installing-grimoirelab.html

      * Many components in GrimoireLab are intended to be used in combination with other tools. For example, there is nothing preventing you from using Grafana for visualizations if that suits you better. It is only a matter of connecting it to the ElasticSearch instance with your data. In fact, I would love seeing somebody trying Grafana with GrimoireLab: let me know if you happen to be interested, I would be more than happy to help.

      On your final sentence, I completely share it. Together, we’re strong. That’s why GrimoireLab is designed to work with other components (for example, you can just feed Pandas dataframes, and then do all the analytics with Pandas in Jupyter notebooks), and reuses a lot of FOSS components itself. Kibana is just one of them.

  2. It is untrue we have translation statistics, we’ve been waiting for it for more than a year now.
    Please remove the reference or explicitly write it as a long-awaited wish. Zanata hosting is obviously really hard, as neither Red Hat nor the infra team has found a solution to upgrade http://fedora.zanata.org to a newer version…

    • Hi! When I mentioned the Zanata translation metrics, I linked to this GitHub repo for the fedmsg2zanata hook. I know it’s not completely finished yet too because I remember talking about this at Flock 2016 too. 🙂 Hopefully we’ll see it completed soon, but for now, I thought it was okay to link to the GitHub repository.


Copyright © 2017 Fedora Community Blog
