feat: internal media proxy (DIYgod#14593)
* feat: image proxy

* feat: use Map

* docs: add docs

* fix: change config name

* Add lightnovel.us as a media source

* fix: break hostname extraction

* feat: add psl for more accurate domain extraction

* feat: split referer map

* perf(deps): replace `psl` with `tldts`

* fix: add indienova.com to referer map
TonyRL authored Feb 29, 2024
1 parent e41ed0c commit cbbd829
Showing 10 changed files with 100 additions and 7 deletions.
1 change: 1 addition & 0 deletions lib/config.js
@@ -108,6 +108,7 @@ const calculateValue = () => {
            allow_user_hotlink_template: envs.ALLOW_USER_HOTLINK_TEMPLATE === 'true',
            filter_regex_engine: envs.FILTER_REGEX_ENGINE || 're2',
            allow_user_supply_unsafe_domain: envs.ALLOW_USER_SUPPLY_UNSAFE_DOMAIN === 'true',
            mediaProxyKey: envs.MEDIA_PROXY_KEY,
        },
        suffix: envs.SUFFIX,
        titleLengthLimit: Number.parseInt(envs.TITLE_LENGTH_LIMIT) || 150,
1 change: 1 addition & 0 deletions lib/v2/rsshub/maintainer.js
@@ -1,4 +1,5 @@
module.exports = {
    '/m/:key/:url': ['TonyRL'],
    '/routes/:lang?': ['DIYgod'],
    '/transform/html/:url/:routeParams': ['ttttmr'],
    '/transform/json/:url/:routeParams': ['ttttmr'],
51 changes: 51 additions & 0 deletions lib/v2/rsshub/media.js
@@ -0,0 +1,51 @@
const got = require('@/utils/got');
const config = require('@/config').value;
const { getDomain } = require('tldts');
const { refererMap } = require('./referer-map');

module.exports = async (ctx) => {
    if (!config.feature.mediaProxyKey) {
        ctx.throw(403, 'Internal media proxy is disabled.');
    }

    const { key } = ctx.params;
    if (key !== config.feature.mediaProxyKey) {
        ctx.throw(401, 'Invalid media proxy key.');
    }

    const url = decodeURIComponent(ctx.params.url);
    const requestUrl = new URL(url);
    const { hostname, origin } = requestUrl;

    const domain = getDomain(hostname);

    let referer = refererMap.get(domain);
    referer ||= origin;

    const { headers } = await got.head(url, {
        headers: {
            referer,
        },
    });

    const cacheControl = headers['cache-control'];
    const contentType = headers['content-type'];
    const contentLength = headers['content-length'];

    if (!contentType.startsWith('image/') || headers.server === 'RSSHub') {
        return ctx.redirect(url);
    }

    ctx.set({
        'cache-control': cacheControl || `public, max-age=${config.cache.contentExpire}`,
        'content-length': contentLength,
        'content-type': contentType,
        server: 'RSSHub',
    });

    ctx.body = await got.stream(url, {
        headers: {
            referer,
        },
    });
};
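
As a reading aid, the branching in `media.js` can be restated as a pure function. This is only a sketch: `decideProxyAction` and its return labels are hypothetical names that do not appear in the commit, and it deliberately treats a missing `content-type` header as "not an image", whereas the handler above calls `contentType.startsWith` directly and so assumes the header is present.

```javascript
// Hypothetical helper, not part of the commit: restates the handler's
// branching so it can be exercised without a running server.
function decideProxyAction({ configuredKey, suppliedKey, contentType, server }) {
    if (!configuredKey) {
        return 'disabled'; // handler answers 403 when MEDIA_PROXY_KEY is unset
    }
    if (suppliedKey !== configuredKey) {
        return 'invalid-key'; // handler answers 401 on a key mismatch
    }
    // Assumption: an absent content-type is handled like a non-image response.
    const isImage = typeof contentType === 'string' && contentType.startsWith('image/');
    if (!isImage || server === 'RSSHub') {
        return 'redirect'; // non-images and RSSHub-served responses are not proxied
    }
    return 'stream'; // images are streamed back with the chosen referer
}
```

For example, `decideProxyAction({ configuredKey: 'k', suppliedKey: 'k', contentType: 'image/png' })` falls through to the streaming branch.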
16 changes: 16 additions & 0 deletions lib/v2/rsshub/referer-map.js
@@ -0,0 +1,16 @@
const refererMap = new Map([
    ['fbcdn.net', 'https://www.facebook.com/'],
    ['cdninstagram.com', 'https://www.instagram.com/'],
    ['moyu.im', 'https://jandan.net/'],
    ['lightnovel.us', 'https://www.lightnovel.us/'],
    ['indienova.com', 'https://indienova.com/'],
    ['pximg.net', 'https://www.pixiv.net/'],
    ['me8gs.app', 'https://www.sehuatang.net/'],
    ['rxn30.app', 'https://www.sehuatang.net/'],
    ['sex.com', 'https://www.sex.com/'],
    ['sinaimg.cn', 'https://weibo.com/'],
]);

module.exports = {
    refererMap,
};
};
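
The lookup that `media.js` performs against this map can be sketched in isolation. The code below is an illustration under stated assumptions: `naiveGetDomain` is a hypothetical stand-in for `getDomain` from `tldts` that simply keeps the last two hostname labels, and `resolveReferer` is a made-up name. Only two entries of the map are copied here.

```javascript
// Hypothetical stand-in for tldts' getDomain: keeps the last two labels.
// That is only adequate for simple suffixes such as .com, .net or .cn;
// the commit relies on tldts for accurate domain extraction.
const naiveGetDomain = (hostname) => hostname.split('.').slice(-2).join('.');

// Two entries copied from the map in the diff above.
const refererMap = new Map([
    ['pximg.net', 'https://www.pixiv.net/'],
    ['sinaimg.cn', 'https://weibo.com/'],
]);

// Returns the mapped referer for the URL's registrable domain,
// falling back to the URL's own origin when the domain is not listed.
function resolveReferer(url, getDomain = naiveGetDomain) {
    const { hostname, origin } = new URL(url);
    return refererMap.get(getDomain(hostname)) || origin;
}
```

Passing `getDomain` as a parameter keeps the sketch self-contained while leaving room to swap in the real `tldts` function.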
1 change: 1 addition & 0 deletions lib/v2/rsshub/router.js
@@ -1,4 +1,5 @@
module.exports = (router) => {
    router.get('/m/:key/:url', require('./media'));
    router.get('/routes/:lang?', require('./routes'));
    router.get('/transform/html/:url/:routeParams', require('./transform/html'));
    router.get('/transform/json/:url/:routeParams', require('./transform/json'));
1 change: 1 addition & 0 deletions package.json
@@ -148,6 +148,7 @@
        "source-map": "0.7.4",
        "tiny-async-pool": "2.1.0",
        "title": "3.5.3",
        "tldts": "6.1.1",
        "tough-cookie": "4.1.3",
        "twitter-api-v2": "1.16.0",
        "uuid": "9.0.1",
14 changes: 14 additions & 0 deletions pnpm-lock.yaml

Generated file; diff not rendered.

4 changes: 3 additions & 1 deletion website/docs/install/config.md
@@ -209,7 +209,7 @@ It is also valid to contain route parameters, e.g. `/weibo/user/2612249974`.

## Features

:::tip Experimental features
:::tip[Experimental features]

Configs in this sections are in beta stage, and **are turn off by default**. Please read corresponded description and turn on if necessary.

@@ -221,6 +221,8 @@ Configs in this sections are in beta stage, and **are turn off by default**. Ple

`ALLOW_USER_SUPPLY_UNSAFE_DOMAIN`: allow users to provide a domain as a parameter to routes that are not in their allow list, respectively. Public instances are suggested to leave this value default, as it may lead to [Server-Side Request Forgery (SSRF)](https://owasp.org/www-community/attacks/Server_Side_Request_Forgery)

`MEDIA_PROXY_KEY`: the access key for internal media proxy.

## Other Application Configurations

`DISALLOW_ROBOT`: prevent indexing by search engine, default to enable, set false or 0 to disable
14 changes: 9 additions & 5 deletions website/docs/routes/other.mdx
@@ -440,11 +440,15 @@ It is recommended to use with clipping tools such as Notion Web Clipper.
| all | development | design | operation | product | other | marketing | sales |
</Route>

## Transformation {#transformation}
## RSSHub {#rsshub}

Pass URL and transformation rules to convert HTML/JSON into RSS.
### Internal Media Proxy {#rsshub-internal-media-proxy}

<Route author="TonyRL" example="/rsshub/m/key/https%3A%2F%2Fdocs.rsshub.app%2Fimg%2Flogo.png" path="/rsshub/m/:key/:url" paramsDesc={['Media Proxy Key', '`encodeURIComponent`ed URL address']} configRequired="1" />

### HTML {#transformation-html}
### Transformation - HTML {#rsshub-transformation-html}

Pass URL and transformation rules to convert HTML/JSON into RSS.

Specify options (in the format of query string) in parameter `routeParams` parameter to extract data from HTML.

@@ -476,7 +480,7 @@ Specify options (in the format of query string) in parameter `routeParams` param
| `item` | `div[class='post-content'] p a` |
</Route>

### JSON {#transformation-json}
### Transformation - JSON {#rsshub-transformation-json}

Specify options (in the format of query string) in parameter `routeParams` parameter to extract data from JSON.

@@ -511,7 +515,7 @@ JSON Path only supports format like `a.b.c`. if you need to access arrays, like
| `itemDesc` | `body` |
</Route>

### Sitemap {#transformation-sitemap}
### Transformation - Sitemap {#rsshub-transformation-sitemap}

Specify options (in the format of query string) in parameter `routeParams` parameter to extract data from Sitemap. (Follows Sitemap Protocol 0.9)
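
The documented route `/rsshub/m/:key/:url` expects the media address to be passed through `encodeURIComponent`. A minimal sketch of building such a link follows; `buildMediaProxyUrl` is a hypothetical helper, and the instance address and key used with it are placeholders rather than values from the commit.

```javascript
// Hypothetical helper: builds a link in the documented
// /rsshub/m/:key/:url shape. `instance` and `key` are placeholders.
function buildMediaProxyUrl(instance, key, mediaUrl) {
    const base = instance.replace(/\/+$/, ''); // drop trailing slashes
    // Both path segments are percent-encoded so that slashes and colons
    // in the media URL do not split the route.
    return `${base}/rsshub/m/${encodeURIComponent(key)}/${encodeURIComponent(mediaUrl)}`;
}
```

Called with the docs' own sample image, `buildMediaProxyUrl('https://rsshub.example', 'key', 'https://docs.rsshub.app/img/logo.png')` yields a path ending in `/rsshub/m/key/https%3A%2F%2Fdocs.rsshub.app%2Fimg%2Flogo.png`, which matches the `example` attribute of the `<Route>` entry.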

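@@ -209,7 +209,7 @@ RSSHub supports three access control methods: access key/code, allowlist, and denylist

## Features

:::tip Experimental features
:::tip[Experimental features]

The options in this section control new features, all of which are **off by default**. Read the corresponding description and enable them only if needed.

@@ -221,6 +221,8 @@ The options in this section control new features, all of which are **off by default**. Read the corresponding description and enable them only if needed.

`ALLOW_USER_SUPPLY_UNSAFE_DOMAIN`: allow users to supply a domain as a route parameter. Public instances are advised not to change this option, as enabling it may lead to [Server-Side Request Forgery (SSRF)](https://owasp.org/www-community/attacks/Server_Side_Request_Forgery)

`MEDIA_PROXY_KEY`: the access key for the built-in media proxy

## Other Application Configurations

`DISALLOW_ROBOT`: prevent indexing by search engines; enabled by default, set to false or 0 to disable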