Add support for custom user_agent to avoid 403 error
machsix authored and anatol committed Oct 25, 2022
1 parent 9b3605b commit c19109c
Showing 4 changed files with 13 additions and 1 deletion.
2 changes: 2 additions & 0 deletions README.md
@@ -80,6 +80,7 @@ repos:
  archlinux-reflector:
    mirrorlist: /etc/pacman.d/reflector_mirrorlist # Be careful! Check that pacoloco URL is NOT included in that file!
http_proxy: http://foo.company.com:8989
user_agent: Pacoloco/1.2
prefetch: # optional section, add it if you want to enable prefetching
  cron: 0 0 3 * * * * # standard cron expression (https://en.wikipedia.org/wiki/Cron#CRON_expression) defining how often to prefetch; see https://github.com/gorhill/cronexpr#implementation for documentation.
  ttl_unaccessed_in_days: 30 # defaults to 30, set it to a higher value than the number of consecutive days you don't update your systems
@@ -93,6 +94,7 @@ prefetch: # optional section, add it if you want to enable prefetching
* `download_timeout` is a timeout (in seconds) for internet->cache downloads. If a remote server gets slow and a file download takes longer than this, the download is terminated. The default value is `0`, which means no timeout.
* `repos` is a list of repositories to mirror. Each repo needs a `name` and the URL of its Arch mirrors. The URL can be specified with either the `url` or the `urls` property; exactly one of them must be used in each repo configuration.
* `http_proxy` is the proxy used to fetch files from repositories.
* `user_agent` is the user agent sent when fetching files from repositories. The default value is `Pacoloco/1.2`.
* The `prefetch` section enables package prefetching. Comment it out to disable it.
* To check that the cron value does what you expect, see the cronexpr [implementation](https://github.com/gorhill/cronexpr#implementation) or [test it](https://play.golang.org/p/IK2hrIV7tUk)
* Regarding `mirrorlist`, make sure that pacoloco itself is NOT included in the chosen `mirrorlist` file. It can be integrated with reflector too, either by changing reflector's output path or by including pacoloco directly for standard repos in `/etc/pacman.conf` (e.g. adding a `Server=...` entry or a custom mirrorlist file which includes only the pacoloco URL).
1 change: 1 addition & 0 deletions config.go
@@ -35,6 +35,7 @@ type Config struct {
	DownloadTimeout int            `yaml:"download_timeout"`
	Prefetch        *RefreshPeriod `yaml:"prefetch"`
	HttpProxy       string         `yaml:"http_proxy"`
	UserAgent       string         `yaml:"user_agent"`
}

var config *Config
8 changes: 8 additions & 0 deletions pacoloco.go
@@ -28,6 +28,7 @@ var mirrorlistRegex *regexp.Regexp // to extract the url from a mirrorlist file
var prefetchDB *gorm.DB
var lastMirrorlistCheck map[string]time.Time // use the file path as a key to get when it had been checked for modifications
var lastModificationTime map[string]time.Time // use the file path as a key to get the last modification time known
var userAgent string

// Accepted formats
var allowedPackagesExtensions []string
@@ -125,6 +126,11 @@ func main() {
		http.DefaultTransport = &http.Transport{Proxy: http.ProxyURL(proxyUrl)}
	}

	userAgent = "Pacoloco/1.2"
	if config.UserAgent != "" {
		userAgent = config.UserAgent
	}

	listenAddr := fmt.Sprintf(":%d", config.Port)
	log.Println("Starting server at port", config.Port)
	// The request path looks like '/repo/$reponame/$pathatmirror'
@@ -348,6 +354,7 @@ func downloadFile(url string, filePath string, ifModifiedSince time.Time) (serve
	// some servers return compressed data without Content-Length header info
	// disable compression as it is useless for package data
	req.Header.Add("Accept-Encoding", "identity")
	req.Header.Set("User-Agent", userAgent)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
@@ -418,6 +425,7 @@ func downloadFileAndSend(url string, filePath string, ifModifiedSince time.Time,
	// some servers return compressed data without Content-Length header info
	// disable compression as it is useless for package data
	req.Header.Add("Accept-Encoding", "identity")
	req.Header.Set("User-Agent", userAgent)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
3 changes: 2 additions & 1 deletion pacoloco.yaml.sample
@@ -18,4 +18,5 @@ prefetch: # optional section, add it if you want to enable prefetching
  ttl_unaccessed_in_days: 30 # defaults to 30, set it to a higher value than the number of consecutive days you don't update your systems
  # Packages (and their db links) are deleted and no longer prefetched if they have not been downloaded for ttl_unaccessed_in_days days after their last update.
  ttl_unupdated_in_days: 300 # defaults to 300; packages that have been neither updated upstream nor requested for ttl_unupdated_in_days days are deleted and no longer prefetched.
# http_proxy: http://proxy.company.com:8888
# http_proxy: http://proxy.company.com:8888
# user_agent: Pacoloco/1.2
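
The change above boils down to two steps: fall back to `Pacoloco/1.2` when `user_agent` is empty, then set the `User-Agent` header on each outgoing request. The following is a minimal standalone sketch of that pattern, not the project's actual code. The helper names `resolveUserAgent` and `echoedUserAgent` are hypothetical, and a local `httptest` server stands in for a real mirror so the header that was actually sent can be observed.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// defaultUserAgent is the fallback value used by the commit.
const defaultUserAgent = "Pacoloco/1.2"

// resolveUserAgent (hypothetical helper) returns the configured value,
// or the default when the configuration leaves it empty.
func resolveUserAgent(configured string) string {
	if configured != "" {
		return configured
	}
	return defaultUserAgent
}

// echoedUserAgent (hypothetical helper) sends a GET request carrying the
// given User-Agent to a local test server and returns the header value
// the server received.
func echoedUserAgent(userAgent string) (string, error) {
	// The test server stands in for a mirror: it echoes the header back.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Header.Get("User-Agent"))
	}))
	defer srv.Close()

	req, err := http.NewRequest(http.MethodGet, srv.URL, nil)
	if err != nil {
		return "", err
	}
	// Set replaces any existing value, so exactly one User-Agent is sent.
	req.Header.Set("User-Agent", userAgent)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	// An empty value models a config file without a user_agent entry.
	for _, configured := range []string{"", "Example/0.1"} {
		seen, err := echoedUserAgent(resolveUserAgent(configured))
		if err != nil {
			fmt.Println("request failed:", err)
			return
		}
		fmt.Printf("configured=%q server saw=%q\n", configured, seen)
	}
}
```

Using `Header.Set` rather than `Header.Add` matters here: `Set` overwrites, so the request carries a single `User-Agent` value instead of appending to one that may already be present.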
