The Digital Services Act (DSA) transparency database, while proving fairly ineffective for misinformation or hate speech researchers, may be very enlightening on copyright moderation. Platform governance researchers have long suspected that YouTube is the most heavily moderated platform on copyright issues, and we now have concrete evidence of this. According to the DSA transparency database, YouTube submitted over 946,000 content moderation decisions concerning intellectual property infringement in the 30 days prior to July 27, 2024. This is more than any other VLOP currently featured in the DSA database. The graph below illustrates YouTube's share of the content moderation decisions taken in one day on the basis of intellectual property infringement. (The One Day in Content Moderation report by PGMT lab is downloadable here.)
However, research conducted under the HORIZON Europe project ReCreating Europe, and recently published in the journal Policy & Internet, suggests that YouTube’s disclosed figures may underrepresent the actual extent of copyright moderation. Our study focused on the effects of Article 17 of the EU Copyright in the Digital Single Market Directive (CDSMD) on copyright moderation practices, comparing Germany and France, two EU member states of similar size and population that differ in the timing of their CDSMD implementation.
The first part of the study presents findings on copyright takedowns on YouTube in Germany and France between 2019 and 2022. To obtain these findings, we examined a subset of videos from the largest YouTube study (conducted by Rieder et al. in 2020) and assessed, first, whether they had been removed by YouTube and, second, how the removals were related to the countries in question, the video categories, and other predictors (such as likes and engagement). The first data set consists of a sample of 4,000,000 videos from EU-based YouTube channels. Based on this, we collected a second data set in 2022, after CDSMD implementation, filtered by country (Germany and France). This second data set is a 2.09% subsample of the original data set, resulting in 83,676 videos. This allowed us to compare the incidence of video removals in both countries and examine the reasons provided by YouTube for these takedowns, such as copyright infringement complaints and account deletions.
Our findings indicate a significant underreporting of copyright-related takedowns, as many removals were attributed to “unknown reasons” or “deletion of associated YouTube account,” which we argue are often indirectly related to copyright moderation. This conclusion is supported by earlier scholarship (e.g. Kaye & Gray, 2021) and by YouTube’s own copyright enforcement mechanisms (YouTube Help, 2023), which can obscure the true reasons for content removal. In sum, this first analysis at the content level clearly shows massive underreporting of copyright-related takedowns by YouTube in the embedded information of removed videos. As we have shown, the label “Unavailability due to Copyright Infringement Complaint” is not a good proxy for assessing the effects of copyright content moderation. There is a strong indication that many of the removals in both the “unknown reasons” and the “account deleted” categories are actually copyright-related.
To further assess how copyright content moderation relates to removals of this underdefined content, we developed additional statistical measures. We used a random forest model to assess how important each variable was for predicting takedowns. The model used variables derived from the video metadata provided by the YouTube API v3.
In our model, category ID emerged as the most important predictor of videos being taken down for an “unknown reason.” Building on this result, we identified the categories that are most prone to copyright enforcement according to the existing literature (e.g. Urban et al., 2017).
As a result, we argue that, in addition to the videos marked as “Unavailability due to Copyright Infringement Complaint,” it is reasonable to add videos labeled “unknown” or “account deleted” if and only if they fall into content categories associated with film, music, gaming, sports, and entertainment. In conclusion, our best-effort estimate after this multistep analysis is that 2.17% of the videos in our sample may have been taken down, both by the platform and by the users themselves, due to copyright content moderation.
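The counting rule described above can be expressed as a short sketch. This is an illustration only, not the study's actual code: the label strings, the category names, and the `Video` record are hypothetical stand-ins for whatever coding scheme was applied to the YouTube metadata.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical label strings; the study's own coding scheme may differ.
COPYRIGHT_LABEL = "copyright_complaint"
AMBIGUOUS_LABELS = {"unknown", "account_deleted"}
# Categories named in the text as most prone to copyright enforcement.
COPYRIGHT_PRONE = {"film", "music", "gaming", "sports", "entertainment"}


@dataclass
class Video:
    video_id: str
    removal_label: Optional[str]  # None means the video is still available
    category: str


def is_copyright_related(video: Video) -> bool:
    """Apply the rule: explicit copyright label, or an ambiguous
    label combined with a copyright-prone category."""
    if video.removal_label == COPYRIGHT_LABEL:
        return True
    return (video.removal_label in AMBIGUOUS_LABELS
            and video.category in COPYRIGHT_PRONE)


def copyright_takedown_rate(videos: Iterable[Video]) -> float:
    """Share of all sampled videos counted as copyright-related takedowns."""
    videos = list(videos)
    if not videos:
        return 0.0
    return sum(is_copyright_related(v) for v in videos) / len(videos)


# Made-up toy sample, purely to exercise the rule.
sample = [
    Video("a", "copyright_complaint", "news"),
    Video("b", "unknown", "music"),
    Video("c", "unknown", "news"),
    Video("d", None, "music"),
]
rate = copyright_takedown_rate(sample)
```

In this toy sample, videos `a` and `b` are counted and `c` and `d` are not, so the rule errs on the side of excluding ambiguous removals outside the copyright-prone categories.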
This result sits right in the middle of existing scholarship on copyright-related takedown rates. In Gray and Suzor’s (2020) study, only roughly 1% of all uploaded videos were removed due to apparent copyright violations, whereas an analysis by Erickson and Kretschmer (2018) of parody videos found a far higher share, 15.5%, of takedowns that might be copyright-related. That Erickson and Kretschmer arrive at such a high estimate is not surprising: parodies are highly susceptible to copyright takedowns. The difference between the estimate by Gray and Suzor and our own might be related to increasing pressure on platforms and to external regulation such as the CDSMD.
Based on this best-effort assessment of the role and scope of copyright content moderation in takedowns, we then compared the findings for the two countries in the study (Germany and France) to test for potential early effects of CDSMD implementation on copyright content moderation.
The results show remarkable differences between Germany and France. France had more takedowns overall, with 3,410 compared with 2,901 in Germany. Yet the relative share of copyright-related takedowns is much higher in Germany, where almost two-thirds of takedowns (64.19%) are copyright-related, compared with only a bit over a third (39.62%) in France.
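To make the comparison concrete, the reported shares can be converted back into approximate counts. The sketch below is a back-of-the-envelope calculation from the totals and percentages quoted above; the resulting counts are implied values, not figures reported by the study, and the helper name `implied_count` is introduced here for illustration.

```python
def implied_count(total: int, share_percent: float) -> int:
    """Approximate number of takedowns implied by a total and a percentage share."""
    return round(total * share_percent / 100)


# (total takedowns, share of copyright-related takedowns in percent), as quoted in the text
takedowns = {
    "France": (3410, 39.62),
    "Germany": (2901, 64.19),
}

implied = {
    country: implied_count(total, share)
    for country, (total, share) in takedowns.items()
}

for country, count in implied.items():
    print(f"{country}: ~{count} copyright-related takedowns")
```

On these numbers, Germany has more implied copyright-related takedowns than France (roughly 1,862 versus 1,351) even though it has fewer takedowns overall.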
To contextualize these results, it is important to note that national copyright regimes have always differed between France and Germany, so the timing and substance of CDSMD implementation is not their only difference. In general, however, the copyright regimes in France and Germany were already harmonized at a rather high level before the CDSMD (Sganga et al., 2023). Article 17 of the CDSMD, as legal scholars have highlighted (Husovec & Quintais, 2021), is not merely a “clarification” of the existing law; it changes the law in fundamental ways. So, while longstanding differences in the copyright regimes of the two countries might cause different blocking behaviors, we consider the early German implementation of the CDSMD a more plausible explanation for the observed differences.