Merged
Commits
25 commits
6ea0014
feat: add development? env check
gildesmarais Aug 17, 2024
1ffeb50
feat: expose endpoint which calls H2r.auto_source
gildesmarais Aug 17, 2024
24734c7
feat: use ssrf_filter to request user provided urls
gildesmarais Aug 17, 2024
0070399
feat: use hash_branch_view_dir
gildesmarais Aug 17, 2024
7588413
feat: adapt to changed auto_source constructor signature
gildesmarais Aug 18, 2024
b2da912
feat: add auto_source#index
gildesmarais Aug 19, 2024
1d6fd45
feat: use content-type constant
gildesmarais Aug 19, 2024
8d98dff
feat: rack-timeout only when RACK_ENV!=development
gildesmarais Aug 19, 2024
b2fedc8
fix: rendering of iframe prevented
gildesmarais Aug 19, 2024
97a8556
refactor: improve auto_source form handling and styles
gildesmarais Aug 19, 2024
f577f1b
feat(auto_source): automatically submit form when ?url= is present
gildesmarais Aug 19, 2024
621a154
feat(auto_source): integrate content_for plugin and enhance UX
gildesmarais Aug 20, 2024
d9a076d
fix(auto_source): update index.erb to improve input validation and UX
gildesmarais Oct 2, 2024
98a3ea0
feat: assert request sources from allowed host/origin
gildesmarais Oct 8, 2024
f78f44a
docs(readme): add instructions for auto_source usage
gildesmarais Oct 8, 2024
b95cdab
feat: use smooth scroll-behaviour
gildesmarais Oct 8, 2024
bfa6791
chore(deps): depend on minimum html2rss version of v0.14
gildesmarais Oct 8, 2024
c4ebb73
feat(auto_source): add specs and re-structure files
gildesmarais Oct 20, 2024
ba5da64
feat(auto_source): add bookmarklet css and style js
gildesmarais Oct 20, 2024
67275bb
test: setup climate_control to safely modify env variables
gildesmarais Oct 20, 2024
e0ffd1e
test(auto_source): add specs for helpers
gildesmarais Oct 20, 2024
322b365
test: add specs for app
gildesmarais Oct 20, 2024
8f87efa
test: tell simplecov to ignore files in config/
gildesmarais Oct 20, 2024
28794c3
test(auto_source): add spec for routes
gildesmarais Oct 20, 2024
397db79
feat(auto_source): tweaks/adjustments on ui
gildesmarais Oct 21, 2024
7 changes: 6 additions & 1 deletion Gemfile
@@ -4,19 +4,21 @@ source 'https://rubygems.org'

git_source(:github) { |repo_name| "https://github.com/#{repo_name}" }

gem 'html2rss'
gem 'html2rss', '~> 0.14'
gem 'html2rss-configs', github: 'html2rss/html2rss-configs'

# Use these instead of the two above (uncomment them) when developing locally:
# gem 'html2rss', path: '../html2rss'
# gem 'html2rss-configs', path: '../html2rss-configs'

gem 'base64'
gem 'erubi'
gem 'parallel'
gem 'rack-cache'
gem 'rack-timeout'
gem 'rack-unreloader'
gem 'roda'
gem 'ssrf_filter'
gem 'tilt'

gem 'puma', require: false
@@ -33,7 +35,10 @@ group :development do
end

group :test do
gem 'climate_control'
gem 'rack-test'
gem 'rspec'
gem 'simplecov', require: false
gem 'vcr'
gem 'webmock'
end
21 changes: 20 additions & 1 deletion Gemfile.lock
@@ -11,8 +11,14 @@ GEM
addressable (2.8.7)
public_suffix (>= 2.0.2, < 7.0)
ast (2.4.2)
base64 (0.2.0)
bigdecimal (3.1.8)
byebug (11.1.3)
climate_control (1.2.0)
concurrent-ruby (1.3.4)
crack (1.0.0)
bigdecimal
rexml
crass (1.0.6)
diff-lcs (1.5.1)
docile (1.4.1)
@@ -25,6 +31,7 @@ GEM
faraday (>= 1, < 3)
faraday-net_http (3.3.0)
net-http
hashdiff (1.1.1)
html2rss (0.14.0)
addressable (~> 2.7)
faraday (> 2.0.1, < 3.0)
@@ -75,6 +82,8 @@ GEM
rack (3.1.7)
rack-cache (1.17.0)
rack (>= 0.4)
rack-test (2.1.0)
rack (>= 1.3)
rack-timeout (0.7.0)
rack-unreloader (2.1.0)
rainbow (3.1.1)
@@ -132,13 +141,18 @@ GEM
simplecov_json_formatter (~> 0.1)
simplecov-html (0.12.3)
simplecov_json_formatter (0.1.4)
ssrf_filter (1.1.2)
thor (1.3.2)
tilt (2.4.0)
tzinfo (2.0.6)
concurrent-ruby (~> 1.0)
unicode-display_width (2.5.0)
uri (0.13.1)
vcr (6.2.0)
webmock (3.24.0)
addressable (>= 2.8.0)
crack (>= 0.3.2)
hashdiff (>= 0.4.0, < 2.0.0)
yard (0.9.36)
zeitwerk (2.6.18)

@@ -151,13 +165,16 @@ PLATFORMS
x86_64-linux

DEPENDENCIES
base64
byebug
climate_control
erubi
html2rss
html2rss (~> 0.14)
html2rss-configs!
parallel
puma
rack-cache
rack-test
rack-timeout
rack-unreloader
rake
@@ -169,8 +186,10 @@ DEPENDENCIES
rubocop-rspec
rubocop-thread_safety
simplecov
ssrf_filter
tilt
vcr
webmock
yard

BUNDLED WITH
66 changes: 53 additions & 13 deletions README.md
@@ -45,9 +45,16 @@ services:
target: /app/config/feeds.yml
read_only: true
environment:
- RACK_ENV=production
- HEALTH_CHECK_USERNAME=health
- HEALTH_CHECK_PASSWORD=please-set-YOUR-OWN-veeeeeery-l0ng-aNd-h4rd-to-gue55-Passw0rd!
RACK_ENV: production
HEALTH_CHECK_USERNAME: health
HEALTH_CHECK_PASSWORD: please-set-YOUR-OWN-veeeeeery-l0ng-aNd-h4rd-to-gue55-Passw0rd!
# AUTO_SOURCE_ENABLED: true
# AUTO_SOURCE_USERNAME: foobar
# AUTO_SOURCE_PASSWORD: A-Unique-And-Long-Password-For-Your-Own-Instance
## to allow just requests originating from the local host
# AUTO_SOURCE_ALLOWED_ORIGINS: 127.0.0.1:3000
## to allow multiple origins, separate them with commas:
# AUTO_SOURCE_ALLOWED_ORIGINS: example.com,h2r.host.tld
watchtower:
image: containrrr/watchtower
volumes:
@@ -66,6 +73,31 @@ The [watchtower](https://containrrr.dev/watchtower/) service automatically pulls

The `docker-compose.yml` above contains a service description for watchtower.

## How to use automatic feed generation

> [!NOTE]
> This feature is disabled by default.

To enable the `auto_source` feature, uncomment the env variables in the `docker-compose.yml` file from above and change the values accordingly:

```yaml
environment:
## … snip ✁
AUTO_SOURCE_ENABLED: true
AUTO_SOURCE_USERNAME: foobar
AUTO_SOURCE_PASSWORD: A-Unique-And-Long-Password-For-Your-Own-Instance
## to allow just requests originating from the local host
AUTO_SOURCE_ALLOWED_ORIGINS: 127.0.0.1:3000
## to allow multiple origins, separate them with commas:
# AUTO_SOURCE_ALLOWED_ORIGINS: example.com,h2r.host.tld
## … snap ✃
```

Restart the container and open <http://127.0.0.1:3000/auto_source>.
When asked, enter your username and password.

Then enter the URL of a website and click on the _Generate_ button.
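
The `AUTO_SOURCE_ALLOWED_ORIGINS` value is a comma-separated list of hosts. The following is a rough sketch of how such a value can be split and matched against a request's host. It is not the project's exact code: the helper names `parse_allowed_origins` and `origin_allowed?` are hypothetical, and the sample hosts are placeholders.

```ruby
require 'set'

# Hypothetical helper: turns "a.example, b.example" into a Set of hosts,
# trimming whitespace and dropping empty entries.
def parse_allowed_origins(raw)
  raw.to_s.split(',').map(&:strip).reject(&:empty?).to_set
end

# Hypothetical helper: true when at least one of the given hosts
# (nil entries are ignored) is part of the allowed set.
def origin_allowed?(allowed, *hosts)
  allowed.intersect?(hosts.compact.to_set)
end

allowed = parse_allowed_origins('example.com, h2r.host.tld')

puts origin_allowed?(allowed, 'h2r.host.tld')     # a listed host
puts origin_allowed?(allowed, 'other.host.test')  # a host that is not listed
```

With an empty or unset value the resulting set is empty, so no host matches. This fits the README's note that the feature is disabled by default.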

## How to use the included configs

html2rss-web comes with many feed configs out of the box. [See the file list of all configs.](https://github.com/html2rss/html2rss-configs/tree/master/lib/html2rss/configs)
@@ -85,7 +117,7 @@ To build your own RSS feed, you need to create a _feed config_.\
That _feed config_ goes into the file `feeds.yml`.\
Check out the [`example` feed config](https://github.com/html2rss/html2rss-web/blob/master/config/feeds.yml#L9).

Please refer to [html2rss' README for a description of _the feed config and its options_](https://github.com/html2rss/html2rss#the-feed-config-and-its-options). html2rss-web is just a small web application that depends on html2rss.
Please refer to [html2rss' README for a description of _the feed config and its options_](https://github.com/html2rss/html2rss#the-feed-config-and-its-options). html2rss-web is just a small web application that builds on html2rss.

## Versioning and releases

@@ -112,15 +144,23 @@ If you're going to host a public instance, _please, please, please_:

### Supported ENV variables

| Name | Description |
| ------------------------------ | -------------------------------- |
| `PORT` | default: 3000 |
| `RACK_ENV` | default: 'development' |
| `RACK_TIMEOUT_SERVICE_TIMEOUT` | default: 15 |
| `WEB_CONCURRENCY` | default: 2 |
| `WEB_MAX_THREADS` | default: 5 |
| `HEALTH_CHECK_USERNAME` | default: auto-generated on start |
| `HEALTH_CHECK_PASSWORD` | default: auto-generated on start |
| Name | Description |
| ------------------------------ | ---------------------------------- |
| `BASE_URL` | default: '<http://localhost:3000>' |
| `LOG_LEVEL` | default: 'warn' |
| `HEALTH_CHECK_USERNAME` | default: auto-generated on start |
| `HEALTH_CHECK_PASSWORD` | default: auto-generated on start |
| | |
| `AUTO_SOURCE_ENABLED` | default: false |
| `AUTO_SOURCE_USERNAME` | no default |
| `AUTO_SOURCE_PASSWORD` | no default |
| `AUTO_SOURCE_ALLOWED_ORIGINS` | no default |
| | |
| `PORT` | default: 3000 |
| `RACK_ENV` | default: 'development' |
| `RACK_TIMEOUT_SERVICE_TIMEOUT` | default: 15 |
| `WEB_CONCURRENCY` | default: 2 |
| `WEB_MAX_THREADS` | default: 5 |

### Runtime monitoring via `GET /health_check.txt`

18 changes: 7 additions & 11 deletions app.rb
@@ -2,7 +2,6 @@

require 'roda'
require 'rack/cache'

require_relative 'roda/roda_plugins/basic_auth'

module Html2rss
@@ -12,12 +11,9 @@ module Web
#
# It is built with [Roda](https://roda.jeremyevans.net/).
class App < Roda
# TODO: move to helper
def self.development?
ENV['RACK_ENV'] == 'development'
end
CONTENT_TYPE_RSS = 'application/xml'

def development? = self.class.development?
def self.development? = ENV['RACK_ENV'] == 'development'

opts[:check_dynamic_arity] = false
opts[:check_arity] = :warn
@@ -33,16 +29,16 @@ def development? = self.class.development?
csp.script_src :self
csp.connect_src :self
csp.img_src :self
csp.font_src :self
csp.font_src :self, 'data:'
csp.form_action :self
csp.base_uri :none
csp.frame_ancestors :none
csp.frame_ancestors :self
csp.frame_src :self
csp.block_all_mixed_content
end

plugin :default_headers,
'Content-Type' => 'text/html',
'X-Frame-Options' => 'deny',
'X-Content-Type-Options' => 'nosniff',
'X-XSS-Protection' => '1; mode=block'

@@ -53,8 +49,9 @@ def development? = self.class.development?
handle_error(error)
end

plugin :hash_branches
plugin :hash_branch_view_subdir
plugin :public
plugin :content_for
plugin :render, escape: true, layout: 'layout'
plugin :typecast_params
plugin :basic_auth
@@ -69,7 +66,6 @@ def development? = self.class.development?

route do |r|
r.public

r.hash_branches('')

r.root { view 'index' }
4 changes: 2 additions & 2 deletions config.ru
@@ -4,8 +4,6 @@ require 'rubygems'
require 'bundler/setup'
require 'rack-timeout'

use Rack::Timeout

dev = ENV.fetch('RACK_ENV', nil) == 'development'

requires = Dir['app/**/*.rb']
@@ -26,6 +24,8 @@ if dev

run Unreloader
else
use Rack::Timeout

require_relative 'app'
requires.each { |f| require_relative f }

62 changes: 62 additions & 0 deletions helpers/auto_source.rb
@@ -0,0 +1,62 @@
# frozen_string_literal: true

require 'addressable'
require 'base64'
require 'html2rss'
require 'ssrf_filter'

module Html2rss
module Web
##
# Helper methods for handling auto source feature.
class AutoSource
def self.enabled? = ENV['AUTO_SOURCE_ENABLED'].to_s == 'true'
def self.username = ENV.fetch('AUTO_SOURCE_USERNAME')
def self.password = ENV.fetch('AUTO_SOURCE_PASSWORD')

def self.allowed_origins = ENV.fetch('AUTO_SOURCE_ALLOWED_ORIGINS', '')
.split(',')
.map(&:strip)
.reject(&:empty?)
.to_set

# @param encoded_url [String] Base64 encoded URL
# @return [RSS::Rss]
def self.build_auto_source_from_encoded_url(encoded_url)
url = Addressable::URI.parse Base64.urlsafe_decode64(encoded_url)
request = SsrfFilter.get(url)
headers = request.to_hash.transform_values(&:first)

auto_source = Html2rss::AutoSource.new(url, body: request.body, headers:)

auto_source.channel.stylesheets << Html2rss::RssBuilder::Stylesheet.new(href: '/rss.xsl', type: 'text/xsl')

auto_source.build
end

# @param rss [RSS::Rss]
# @param default_in_minutes [Integer]
# @return [Integer]
def self.ttl_in_seconds(rss, default_in_minutes: 60)
(rss&.channel&.ttl || default_in_minutes) * 60
end

# @param request [Roda::RodaRequest]
# @param response [Roda::RodaResponse]
# @param allowed_origins [Set<String>]
def self.check_request_origin!(request, response, allowed_origins = AutoSource.allowed_origins)
if allowed_origins.empty?
response.write 'No allowed origins are configured. Please set AUTO_SOURCE_ALLOWED_ORIGINS.'
else
origin = Set[request.env['HTTP_HOST'], request.env['HTTP_X_FORWARDED_HOST']].delete(nil)
return if allowed_origins.intersect?(origin)

response.write 'Origin is not allowed.'
end

response.status = 403
request.halt
end
end
end
end
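
Note on `build_auto_source_from_encoded_url`: it expects the target URL as a URL-safe Base64 string and decodes it with `Base64.urlsafe_decode64`. A minimal round-trip sketch of that encoding follows. The helper names `encode_url` and `decode_url` and the sample URL are hypothetical, and how the encoded value reaches the route is not shown in this diff.

```ruby
require 'base64'

# Hypothetical helper: encodes a URL so it can travel inside a path or
# query string ('+' and '/' are replaced by '-' and '_').
def encode_url(url)
  Base64.urlsafe_encode64(url)
end

# Hypothetical helper: reverses encode_url. Raises ArgumentError when the
# input is not valid URL-safe Base64.
def decode_url(encoded)
  Base64.urlsafe_decode64(encoded)
end

url = 'https://example.com/articles?page=1'
encoded = encode_url(url)

puts encoded
puts decode_url(encoded) == url
```

Since the decoded value is user-provided, the helper above fetches it through `SsrfFilter.get` rather than a plain HTTP client.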
11 changes: 10 additions & 1 deletion helpers/error_handlers.rb → helpers/handle_error.rb
@@ -1,5 +1,8 @@
# frozen_string_literal: true

require 'html2rss/configs'
require_relative '../app/local_config'

module Html2rss
module Web
class App
@@ -15,15 +18,21 @@ def handle_error(error) # rubocop:disable Metrics/MethodLength
when LocalConfig::NotFound,
Html2rss::Configs::ConfigNotFound
set_error_response('Feed config not found', 404)
when Html2rss::Error
set_error_response('Html2rss error', 422)
else
set_error_response('Internal Server Error', 500)
end

@show_backtrace = ENV.fetch('RACK_ENV', nil) == 'development'
@show_backtrace = self.class.development?
@error = error

set_view_subdir nil
view 'error'
end

private

def set_error_response(page_title, status)
@page_title = page_title
response.status = status
2 changes: 1 addition & 1 deletion helpers/handle_html2rss_configs.rb
@@ -7,7 +7,7 @@ def handle_html2rss_configs(request, _folder_name, _config_name_with_ext)
path = RequestPath.new(request)

Html2rssFacade.from_config(path.full_config_name, typecast_params) do |config|
response['Content-Type'] = 'text/xml'
response['Content-Type'] = CONTENT_TYPE_RSS
HttpCache.expires(response, config.ttl * 60, cache_control: 'public')
end
end